Thursday, 27 May 2004

Vaguely tenurable activity

Here’s a brief article on Quantian (a “live Linux” CD with lots of scientific and mathematical goodies on it) that I’m working on with a fellow Debianista for submission to The Political Methodologist, our humble little organized-section newsletter. Any comments or feedback would be appreciated.

Friday, 7 May 2004

You were just a waste of time

Josh Chafetz asks the $64,000 question in American public opinion polling:

[W]hat, exactly, is the point of continually doing nationwide polls when all that matters are the states? I mean, I know nationwide polls are a lot cheaper, but just making up the results would be cheaper still, and only marginally more relevant.

Well, I don’t know that nationwide polling is truly irrelevant; the state-level poll results would be close to a simple linear function of the national polling number, although the effects of campaign advertising—concentrated in the “battleground states”—will cause divergence from linearity.

But given sampling theory and the closeness of presidential elections, you’d have to survey a lot more people to get accurate state-level data… realistically, a sample of fewer than 500 respondents per state is useless, which means polling 25,500 people (including D.C.) per survey—and you’re still getting a sampling error of ±4.5% in each state. So the best you can do is pretty much what’s done in practice: national surveys augmented by state-level surveys in the states that are a priori believed to be close.
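
For the curious, the arithmetic behind that claim is easy to check with a quick back-of-the-envelope sketch in R, using the standard worst-case margin-of-error formula at p = 0.5:

    # 95% margin of error for a simple random sample, at the worst case p = 0.5
    moe <- function(n, p = 0.5, conf = 0.95) {
      z <- qnorm(1 - (1 - conf) / 2)   # ~1.96 for 95% confidence
      z * sqrt(p * (1 - p) / n)
    }

    moe(500)    # ~0.044, i.e. roughly the +/-4.5 points quoted above
    51 * 500    # 25,500 respondents: 50 states plus D.C.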

Friday, 30 April 2004

Average Students

Len Cleavlin has a classic example (albeit probably apocryphal) of the dangers of the arithmetic mean:

There’s an old joke that the Geography Department at the University of North Carolina would tell prospective majors the average salary of graduates with a bachelor’s degree in geography from UNC, without telling them that UNC alumnus and NBA star Michael Jordan received his bachelor’s in geography….

Ole Miss’ criminal justice department might consider employing this trick, as New Orleans Saint Deuce McAllister was a CJ major (though I’m damned if I know whether or not he actually graduated); even given the number of CJ majors who’ve matriculated, Deuce’s NFL salary would probably bump up the mean by a few grand.
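
For the skeptical, here’s a toy illustration in R; the numbers are invented, not actual UNC or Ole Miss data:

    # 99 ordinary graduates at $30,000 and one Jordan-sized outlier
    salaries <- c(rep(30000, 99), 30e6)
    mean(salaries)     # $329,700: the figure the department would quote
    median(salaries)   # $30,000: what the typical graduate actually earns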

Inductive reasoning

David Pinto doesn’t think much of ESPN’s continued plugging of its “productive outs” statistic, and in particular Buster Olney’s justification thereof, as summarized by Pinto:

The basic argument is: here’s a stat, this team is good at it, this team won, so it must be important to be good at that stat.

I’m all for inductive reasoning, but inductive reasoning from a single case is, generally speaking, not a very smart idea…

In other baseball news, Ole Miss finally got off the schneid with a 2–1 victory over Murray State on Wednesday (snapping a six-game losing streak); let’s hope they can stay on track this weekend at South Carolina.

Wednesday, 14 April 2004

Misery loves company

Dan Drezner takes a look at John Kerry’s “new and improved!” misery index:

Every index can be challenged on the quality of the data that goes into it, and the weights that are assigned to the various components that make up the overall figure. A lack of transparency about methodology is also a valid criticism. For example, in my previous post on the competitiveness of different regions in the global information economy, the company responsible for the rankings provides little (free) information on how the index was computed. That’s a fair critique.

Even when the methodology is transparent, there can still be problems.

This is a subject near and dear to my heart. In quantitative social science, your econometric model is only as useful as your indicators; a crappy indicator renders the whole model essentially useless.

Unfortunately, our ways of dealing with the problem of how well an indicator reflects a concept leave a lot to be desired; “face validity”—which boils down to “I think the indicator reflects the concept, so we’ll a priori assume it does”—is relied on, even by good scholars, to an extent that will make you blanch. Even seemingly obvious indicators, like responses to survey questions, are often woefully inadequate for measuring “true” concepts (in the case of public opinion research, attitudes and predispositions).

Building an index helps with some of these problems—if the items’ measurement errors are independent, they tend to cancel out when combined—but introduces others (like ascribing valid weights to the items, as Dan points out). A few cool tools, like factor analysis and its cousin principal components analysis, are designed to help in finding weights, but even they have problems and limitations, most of which boil down to the fact that human judgment is still involved in the process.
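
To make that concrete, here’s a minimal sketch in R of using principal components to derive item weights; the data are simulated, and real applications are considerably messier:

    set.seed(42)
    latent <- rnorm(200)                    # the unobserved concept
    # four noisy indicators of the same concept, with increasing error
    items  <- sapply(1:4, function(i) latent + rnorm(200, sd = i / 2))
    pc     <- prcomp(items, scale. = TRUE)
    pc$rotation[, 1]    # first-component loadings = candidate weights (sign is arbitrary)
    index  <- pc$x[, 1] # the resulting one-dimensional index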

Thursday, 25 March 2004

Fault line

Tyler Cowen points out new research that indicates no-fault divorce laws have led to lower levels of domestic violence and suicide among women.

Thursday, 11 March 2004

Virginity pledges not kept; news at 11

James Joyner links this NYT piece with the snarky comment:

I await the study that investigates New Years resolutions.

I tend to agree with commenter “steve,” who writes:

Too bad there are some in this country who want to make so called virginity pledges part of serious public policy. When serious people call for new years resolutions in order to solve serious socail [sic] problems your point will stand.

But I think there’s an interesting question here: why aren’t many of the pledges kept? I suspect it has to do with peer pressure: students who don’t sincerely want to make virginity pledges are pressured into them by religious groups they are affiliated with, parents, or friends. And, in general, people don’t keep pledges when there’s no effective sanctioning mechanism to ensure fealty to them; unless you’re female and get knocked up, nobody’s going to know whether or not you actually kept a virginity pledge.

That said, one other part of the study, as reported in USA Today, seemed a bit puzzling:

The study also found that in communities where at least 20% of adolescents pledged the STD rates for everyone combined was 8.9%. In communities with less than 7% pledgers, the STD rate was 5.5%.

Not only is this a massive ecological inference problem (there’s absolutely no way to show causality here), but the proposed causal mechanism doesn’t even function properly: adolescents are a relatively small part of the population, dwarfed by the sexually active adult population. Nor is there any test of whether the pledge rate affects STD rates over time—which at least might get at the question of whether pledges have some aggregate effect on STD incidence. Most odd.
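
To see how an ecological inference can go wrong, here’s a deliberately contrived simulation in R (all numbers invented): communities with higher baseline risk have more pledgers, so pledge rates and STD rates correlate positively across communities even though pledging is, by construction, protective for individuals:

    set.seed(3)
    baseline   <- runif(50, 0.02, 0.12)   # community baseline STD risk
    # riskier communities produce more pledgers
    pledge_pct <- 0.05 + 1.5 * baseline + rnorm(50, sd = 0.02)
    # pledging is (by construction) mildly protective for individuals
    std_rate   <- baseline * (1 - 0.3 * pledge_pct)
    cor(pledge_pct, std_rate)   # still strongly positive across communities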

Anyway, I tend to agree with critics that government-led efforts to encourage abstinence—a cornerstone of both the Bush and Clinton administrations’ “sex ed” policy*—are likely to be completely ineffective, if not counterproductive, in reducing teen pregnancy and STD transmission. The feds should find something better to waste our money on instead…

Sunday, 7 March 2004

Coin-toss bias

Robert Garcia Tagorda, Christopher Genovese, and Alex Tabarrok today take note of this article in Science News, which reports that three researchers have found that tossed coins land the same way up as they started about 51% of the time.

Why hasn’t this been discovered in practice before? Interestingly, the article discusses a previous experiment with coin tossing that didn’t discover any bias:

During World War II, South African mathematician John Kerrich carried out 10,000 coin tosses while interned in a German prison camp. However, he didn’t record which side the coin started on, so he couldn’t have discovered the kind of bias the new analysis brings out.

Kerrich most likely didn’t discover the bias because some other part of his coin-tossing procedure ensured randomness. And, indeed, over a large number of trials, if there’s no bias in the starting condition (the coin starts out “heads” about as often as “tails”), there will be no bias in the aggregate result—even given this finding.*
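
A quick simulation in R makes the point, assuming the reported 51% same-side bias:

    set.seed(1)
    n     <- 100000
    start <- rbinom(n, 1, 0.5)    # 1 = the coin starts heads-up
    same  <- rbinom(n, 1, 0.51)   # lands the same side up 51% of the time
    heads <- ifelse(same == 1, start, 1 - start)
    mean(heads)                   # ~0.50: the bias washes out in aggregate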

More to the point, the practical value of this finding seems minimal. The most obvious application—wagering—is precluded because no casino game that I’m aware of uses coin flips. It’s possible that the ball in roulette and the dice in craps are similarly biased, but again, only given a known starting position—something that is rare in roulette at least, as the ball is under the control of the casino staff rather than the wagerers.

More on NOMINATE

James Joyner isn’t quite convinced by Jeff Jenkins’ argument—based on Keith Poole and Howard Rosenthal’s NOMINATE method—that John Kerry is more conservative, relative to past Democratic presidents, than George Bush is liberal relative to past Republican ones. James writes:

The problem I have with Poole’s coding methodology is that it’s excessively time bound. To compare Bush 43 to Reagan or Kerry to Carter ignores massive shifts in public opinion during those time periods. The “center” is not a spot on a map; it’s a median of current attitudes.

There are actually two versions of Poole and Rosenthal’s methodology. The version Jenkins apparently used for his analysis (judging from the description in the article) is called W-NOMINATE, and it only looks at a particular Congressional session (e.g. the 107th Congress, from 2001 to 2003). A second version, called DW-NOMINATE, allows comparisons over time between Congresses. In other words, W-NOMINATE is inappropriate for comparisons over time.* James goes on to write:

I’d think the ACU/ADA ratings are much more useful than Poole’s, since the comparison is made against one’s contemporaries.

Actually, ACU and ADA ratings are essentially interchangeable with W-NOMINATE first dimension scores. But I think James is critiquing Jenkins for something that Jenkins actually didn’t do (even though the article might lead you to think he did).

It seems to me there are two related questions here: is Bush more extreme than Kerry? And are Bush and Kerry more extreme relative to their partisan predecessors? The first question was pretty clearly answered by Jenkins in the article. The second can’t be answered by the W-NOMINATE method that Jenkins used—which, given his indication that he deliberately simplified the analysis (by using W-NOMINATE instead of DW-NOMINATE), makes it seem odd that he tried to make comparisons over time. The question I think Jenkins actually answered is “are Bush and Kerry more extreme than their predecessor presidents vis-à-vis the Congresses they faced?”—and, for that comparison, W-NOMINATE and ADA/ACU scores would work equally well.

Update: Jeff Jenkins has a comment at Dan’s place that clarifies the situation; he did use DW-NOMINATE for the interyear comparisons, but that point was lost in the editing process. So ignore the above paragraph. ☺ He has some interesting points too in regard to Poole and Rosenthal’s book, Congress: A Political-Economic History of Roll Call Voting.

Also worth pointing out is the forthcoming APSR piece by Josh Clinton, Simon Jackman, and Doug Rivers, “The Statistical Analysis of Roll Call Data.” There’s also a recent issue of Political Analysis in which all of the articles were on ideal-point estimation (ideal-point estimation being the general term for what NOMINATE and the Clinton-Jackman-Rivers approach do). And, if you want to do it yourself, Andrew Martin and Kevin Quinn have included the Clinton-Jackman-Rivers procedure in their MCMCpack package for GNU R.
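
For the do-it-yourselfers, here’s a minimal sketch of what such a run might look like; votes stands for a hypothetical legislator-by-roll-call matrix of 1s (yea), 0s (nay), and NAs (abstentions), and the legislator names in the constraints are placeholders that merely pin down the left-right orientation of the scale:

    library(MCMCpack)
    # rows of 'votes' are legislators, columns are roll calls
    fit <- MCMCirt1d(votes,
                     theta.constraints = list(KENNEDY = "-", FRIST = "+"),
                     burnin = 1000, mcmc = 10000)
    summary(fit)   # posterior summaries of each legislator's ideal point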

I previously discussed Kerry’s ideology here. Dan Drezner also discusses the article in question here.

Thursday, 4 March 2004

How liberal is John Kerry?

Tom Maguire suggests that the National Journal finding that John Kerry is the most liberal member of the Senate isn’t supported by Poole and Rosenthal’s NOMINATE scores, at least not over the last two Congresses. He also quibbles:

Any fool can ask a question that ten wise men cannot answer: Dr. Poole bases his rankings on all recorded roll call votes, including the straight party-line organizational votes – for example, all Republicans voted for Bill Frist as Leader, and for the various Republican committee chairpersons. My suspicion is that the results give a good ranking within parties (so Kerry is really a centrist Dem), but the border between Republican and Democrat on substantive votes is blurrier than these results suggest. Objectivity and simplicity might suffer, but has this been looked at?

My (admittedly fuzzy) recollection of NOMINATE is that the results are fairly robust when you exclude pure party-line votes from the input data. A second approach comes in a recent paper (released Monday!) by Joshua Clinton, Simon Jackman, and Doug Rivers, which uses a Bayesian item-response theory model (the same method as in their forthcoming APSR piece, a variant of which I used to measure political knowledge in my dissertation); the abstract follows:

We reanalyze the 62 key Senate roll calls of 2003, as identified by National Journal, using a statistical procedure that (1) is sensitive to different rates of abstention across senators and roll calls; (2) allows us to compute margins of errors on voting scores and the ranks of the legislators, as well as compute the probability that a given senator occupies a particular rank (e.g., is the “most liberal” senator). The three Democratic senators running for president in 2003 have markedly higher rates of abstention than the rest of the Senate, leading to considerable uncertainty as to their voting score (particularly for Senator Kerry). In turn, we find that contrary to recent media reports, Senator Kerry (D-MA) is not the “most liberal” senator, or at least not unambiguously; as many as three Senators could plausibly be considered the “most liberal”, with Kerry third on this list behind Senators Reed (D-RI) and Sarbanes (D-MD).

The note lacks any high-powered math, and should be accessible to anyone with an interest in politics and a modicum of statistical knowledge. Incidentally, their method does show a closer overlap between Democrats and Republicans than NOMINATE does (in part because they restricted the analysis to 62 “key” votes rather than all of the roll calls). One other thing to note: the whopping error bar around Kerry’s position, a direct result of his absenteeism from the Senate over the past year.
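
Incidentally, the “probability that a given senator occupies a particular rank” is a nice illustration of what posterior simulation buys you: given a matrix of draws of the ideal points (rows are simulation draws, columns are named senators), the probability that each senator is the “most liberal” is just the share of draws in which that senator has the leftmost score. A sketch in R:

    # draws: iterations x senators matrix of posterior ideal-point samples
    rank_probs <- function(draws) {
      leftmost <- apply(draws, 1, which.min)   # most liberal senator in each draw
      table(factor(leftmost, levels = seq_len(ncol(draws)),
                   labels = colnames(draws))) / nrow(draws)
    }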

Thursday, 26 February 2004

Central tendency

Vance of Begging to Differ takes issue with FactCheck.org’s charge that the Bush administration’s claimed average tax cut of $1,586 is “misleading” because it relies on the mean instead of the median.† Vance writes:

I can think of a valid justification for either measure. If you’re trying to understand the overall economic effects of the tax cuts, for example, an average is entirely applicable.

In the case where data is “normally distributed”—following the “bell curve” known to statisticians—the mean and the median are essentially the same.* When they differ, the data is said to be skewed, and measures of central tendency that are sensitive to extreme values (like the mean) can be misleading, as they no longer describe the typical observation. The income distribution, for example, is skewed right.‡
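
Here’s what that looks like with simulated, income-like (lognormal) data in R:

    set.seed(7)
    incomes <- rlnorm(10000, meanlog = 10.5, sdlog = 0.8)
    mean(incomes)     # ~$50,000: pulled up by the long right tail
    median(incomes)   # ~$36,000: the 'most typical' income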

To cast things in non-mathematical terms: when people think about averages, they are thinking about what is most typical, not about distributions. And, in general, the median better reflects this everyday sense of “average” than the mean does. While the mean has technical value for specialists and those who want to engage in further analysis, the median does a better job of capturing the “most typical” observation in most data.