Thursday, 18 September 2003

The state of the art (of polling)

The California Recall has prompted a few questions about various polling techniques. As someone who’s put in his fair share of hours doing telephone survey research, and who has heard a version of the Harris Interactive “pitch” from one of their in-house statisticians*, I thought I’d try to clear up some confusion.

The “traditional” way of doing political polling these days is a system called “random digit dialing” (RDD). To get the number of respondents they need, professional pollsters call several thousand households drawn from lists of residential numbers prepared by companies like Survey Sampling Inc.; if you’re feeling cheap, there are other alternatives (with a much higher non-response rate). (Before RDD, we did stuff like what Zogby did in Iraq recently; that sort of quasi-random “man on the street” interviewing is still common in non-industrialized countries, and is essentially the same as contemporary exit polling in the United States.)
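The sampling firms don’t publish their exact procedures, but a minimal sketch of the list-assisted variant of RDD looks something like this (the area codes and exchanges below are invented for illustration):

    import random

    # Hypothetical seed list of area-code/exchange pairs known to contain
    # working residential lines (the numbers here are invented).
    KNOWN_PREFIXES = [("916", "555"), ("415", "664"), ("212", "442")]

    def random_digit_number():
        """List-assisted RDD: pick a working prefix, then randomize the
        last four digits so unlisted numbers can still be reached."""
        area, exchange = random.choice(KNOWN_PREFIXES)
        suffix = "".join(str(random.randint(0, 9)) for _ in range(4))
        return f"{area}-{exchange}-{suffix}"

    # Draw far more numbers than needed, to cover expected non-response.
    sample = [random_digit_number() for _ in range(4000)]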

RDD worked pretty well for polling until computers arrived on the scene in the mid-80s, along with the hardcore telemarketing industry. In the past two decades, response rates have dropped off sharply, requiring more calls to obtain a valid sample for statistical inference. Coupled with the spread of answering machines and caller ID, falling response rates have undermined the effectiveness of RDD at producing a truly random sample.
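To see why that matters, here’s a back-of-the-envelope sketch (the response rates are illustrative, not actual industry figures):

    import math

    def completes_needed(margin, p=0.5, z=1.96):
        """Completed interviews needed for a given margin of error at 95%
        confidence, assuming simple random sampling and worst-case p = 0.5."""
        return math.ceil(z ** 2 * p * (1 - p) / margin ** 2)

    n = completes_needed(0.03)       # ~1,068 completes for a +/-3% margin
    for rate in (0.60, 0.30, 0.15):  # response rates, then and now (invented)
        print(f"at {rate:.0%} response: ~{math.ceil(n / rate):,} dials")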

The Internet allows a few new options. Internet survey delivery lets respondents complete surveys at their own convenience, and also permits the delivery of visual stimuli (like photographs, drawings, and long blocks of text to be read), which is useful for experimental designs. The drawback is that just sticking a survey on the Internet produces a non-random sample, the most notorious instance of which is the abomination known as the “web poll.” Since respondents to web polls self-select, we have no idea how representative they are of the public at large.
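A toy simulation makes the point (every number here is invented):

    import random

    # Suppose 40% of the public holds some view, but people who hold it are
    # three times as likely to bother clicking on a web poll.
    random.seed(1)
    responses = []
    for _ in range(100_000):
        holds_view = random.random() < 0.40
        clicks = random.random() < (0.15 if holds_view else 0.05)
        if clicks:
            responses.append(holds_view)

    print(f"web poll: {sum(responses) / len(responses):.0%}")  # ~67%, not 40%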

Two groups in the U.S. have tried to tackle the non-random response issue from different directions. Knowledge Networks (KN) solves the representativeness problem by offering its surveys only to a randomly-selected sample of households. Rather than recruiting a new batch of respondents for each survey (as in a traditional phone survey), KN maintains a rolling panel of several thousand households that participate in studies. Panel households receive free WebTV service for the duration of their membership and, in exchange, must complete a certain number of surveys, which are delivered to the household via WebTV. (This approach is basically the same as that employed by the Nielsens for television ratings.) As in a traditional phone survey, some weighting is done to adjust the sample for stratification and clustering effects. KN’s co-founders are Stanford University professors Norman Nie and Douglas Rivers; Stanford apparently has an arrangement for reduced-cost surveys with KN due to this relationship (at least judging from the number of Stanford professors and graduate students I see at conferences using KN-based experimental and survey data).
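KN doesn’t publish the details of its weighting scheme, but the design-based idea is simple enough; a minimal post-stratification sketch with invented cells and shares:

    # Weight each demographic cell so the sample's mix matches the
    # population's; all numbers below are made up for illustration.
    population_share = {"18-34": 0.30, "35-54": 0.40, "55+": 0.30}
    sample_share     = {"18-34": 0.20, "35-54": 0.45, "55+": 0.35}

    weights = {cell: population_share[cell] / sample_share[cell]
               for cell in population_share}
    print(weights)  # {'18-34': 1.5, '35-54': 0.889..., '55+': 0.857...}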

The other approach, employed by Harris Interactive, is to make post-hoc adjustments through a technique called “propensity weighting.” Harris has a truly Internet-based panel with a larger membership than KN’s (some of the difference in size is due to Harris also doing survey work outside the United States; however, they also use bigger samples for each survey for other reasons, which I’ll get to shortly). Surveys are administered via the user’s web browser in response to invitations; participants earn points for completing surveys and also get entries in regular drawings for cash prizes. Instead of ensuring that participants are representative of the population at large, Harris uses propensity weighting to reweight respondents based on their demographic and behavioral characteristics and the frequency of those characteristics in the population at large (weighting schemes for other survey techniques are generally based on the design of the sampling procedure). It is important to emphasize that Harris’ technique is not based on random samples; rather, propensity weighting is designed to make the sample behave “as if” it had been selected randomly.
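Harris keeps the specifics proprietary, but the general flavor of propensity weighting can be sketched with synthetic data and an off-the-shelf logistic regression (everything below is an illustrative assumption, not Harris’ actual model):

    import numpy as np
    from sklearn.linear_model import LogisticRegression

    # 'reference' stands in for a small probability sample (e.g. a parallel
    # RDD survey); 'panel' for the self-selected online sample, with its
    # covariates shifted to mimic the demographic skew of Internet users.
    rng = np.random.default_rng(0)
    reference = rng.normal(0.0, 1.0, size=(500, 3))
    panel = rng.normal(0.5, 1.0, size=(2000, 3))

    X = np.vstack([reference, panel])
    in_panel = np.array([0] * len(reference) + [1] * len(panel))

    # Model each respondent's propensity to turn up in the online panel...
    propensity = LogisticRegression().fit(X, in_panel).predict_proba(panel)[:, 1]

    # ...then weight panelists inversely: respondents who "look like" the
    # people rarely found online count for more.
    weights = (1 - propensity) / propensity
    weights *= len(panel) / weights.sum()  # normalize to the sample size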

Which technique is better? All of them have flaws, particularly when trying to reach certain subpopulations like the homeless and indigent (Harris’ technique might find the occasional homeless guy who checks his email at the library; KN and RDD would never catch him). For voting research, however, all of these techniques should fare reasonably well: generally speaking, voting participation is correlated with the same variables associated with having a telephone, a stable household, and Internet access. And to the extent that some population groups are less likely to be online, propensity weighting should adjust for that (in Harris’ case).

Earlier this year, Political Analysis published an article comparing all three techniques; it found that RDD, KN, and Harris generally provided estimates of population parameters within the reported margin of error, with a few notable exceptions. For inferential statistics (figuring out the relationships among variables), which is generally what political scientists are interested in, the sampling issues are relatively unimportant; but for the descriptive statistics (figuring out what the population at large looks like) that pollsters and the media care about, there may be more important issues that weren’t addressed in the PA piece.

But generally, both KN and Harris appear to have credible techniques that have been validated against actual election results, so their conclusions are as likely to be correct as those of traditional RDD surveys like the Field Poll and the L.A. Times poll.

* Disclaimer: I got a free set of noise-cancelling headphones for answering a few dozen Harris surveys. I have received nothing from the Leland Stanford Junior University except a rejection letter when I applied to them when I was in high school and a free campus tour from a total ditz named “Mei Mei.” The joys of a photographic memory.