To those used to this year’s significant variation in polling results for different contests, the latest batch of contradictory surveys may not seem unusual. But what’s happening now largely reflects the switchover most pollsters have completed from less selective to more selective samples (or in some cases, the same samples with new weighting or adjustment factors) as part of their effort to identify likely voters.
Nate Silver’s got a good summary of the wild variations produced by different LV models in terms of the generic congressional ballot:
Just this past weekend, for instance, a Newsweek poll showed Democrats 5 points ahead among registered voters — already a good number for them — but with a larger lead of 8 points among likely voters (Newsweek calls them “definite voters,” but it’s basically the same thing). That is, it showed a 3-point likely voter gap in the Democrats’ favor. By contrast, as we noted, the Gallup poll shows as much as a 15-point swing in Republicans’ favor when a likely voter model is applied.
Mark Blumenthal has published a very good basic primer on why LV numbers differ so much from each other, and from other measurements of the electorate. He begins by presenting the most famous model, the one used by Gallup, which combines a series of questions about respondents’ intent to vote and past voting history with an adjustment based on an overall estimate of turnout. Blumenthal then notes the other best-known approaches:
* The CBS/New York Times variant, which is similar to the Gallup approach except that rather than select specific respondents as likely voters, it weights all registered voters up or down based on their probability of voting.
* The use of two or three questions at the beginning of the interview to simply screen out respondents who say they are not registered or not likely to vote.
* The application of quotas or weights to adjust the completed interviews to match the pollster’s expectations of the demographics or regional distribution of likely voters.
* The application of quotas or weights to match the pollster’s expectations of the party affiliation of likely voters. I break this one out separately because it remains among the most controversial likely voter “modeling” tools.
* Sampling respondents from lists that draw on official records of the actual vote history of individual voters, so that when the pollster calls John Doe, they already know whether Doe has voted in past elections.
* Finally, many believe that the use of an automated, recorded-voice methodology rather than a live interviewer is itself a useful tool in obtaining a more accurate measurement of the intent to vote.
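The practical stakes in these choices are easy to see with a toy example. Here is a minimal sketch (using entirely hypothetical interview data and a made-up likelihood cutoff) contrasting two of the approaches above: a hard screen that discards respondents below a turnout-likelihood threshold, versus the CBS/New York Times-style approach of keeping every registered voter but weighting each by their probability of voting.

```python
# Hypothetical data: (candidate preference, estimated probability of voting).
interviews = [
    ("D", 0.9), ("D", 0.8), ("D", 0.3), ("D", 0.2),
    ("R", 0.9), ("R", 0.9), ("R", 0.7), ("D", 0.5),
]

def screened_share(data, party, cutoff=0.6):
    """Hard screen: keep only respondents at or above the cutoff,
    then take a simple share among those 'likely voters'."""
    likely = [pref for pref, prob in data if prob >= cutoff]
    return sum(1 for pref in likely if pref == party) / len(likely)

def weighted_share(data, party):
    """Probability weighting: every respondent counts,
    scaled by their estimated turnout probability."""
    total = sum(prob for _, prob in data)
    return sum(prob for pref, prob in data if pref == party) / total

print(f"Screened D share: {screened_share(interviews, 'D'):.2f}")  # 0.40
print(f"Weighted D share: {weighted_share(interviews, 'D'):.2f}")  # 0.52
```

From the very same interviews, the screen shows Republicans ahead 60–40 while the weighting approach shows Democrats narrowly ahead, because the weighted model still gives partial credit to lower-propensity Democratic respondents the screen throws away. That is exactly the kind of divergence the list above can produce.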
Hardly just technical differences in these approaches, eh? And without impugning anyone’s motives, it should be obvious that LV models that depend on imposing some sort of expectation about the partisan composition of the electorate could nicely coexist with partisan bias.
In any event, most LV models tend to converge a bit and become more accurate as election day approaches and registered voters make up their minds whether to participate. At present, though, it’s important to have some idea of how individual pollsters determine likelihood to vote, and how that might affect the results…