Saturday, 7 February 2009

Repurposed content

Herein I present a rant on one-tailed tests in the social sciences; feedback welcome:

Unless you have a directional hypothesis for every coefficient before your model ever makes contact with the data, you have no business doing a one-tailed statistical test. Besides, if your hypotheses are solid and you have a decent n, the tailedness shouldn’t be what determines significance or the lack thereof.

Thought experiment: assume you present a test in a paper that comes out p=.06, one-tailed. That means you have a hypothesis that doesn’t really work to begin with (sorry, “approaches conventional levels of statistical significance”). More importantly, if you just made up the directional hypothesis after the fact to put a little dagger (or, heaven forbid, a star) next to the coefficient, you really did a two-tailed test with p=.12 and then justified it post hoc to make the finding sound better than it really was.
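For anyone who wants to see the arithmetic rather than take my word for it, here is a minimal sketch (Python with scipy, using a large-sample normal approximation and a made-up test statistic) of how a one-tailed p=.06 is the same result as a two-tailed p=.12:

```python
from scipy import stats

# Hypothetical example: a coefficient whose test statistic lands exactly at
# the one-tailed p = .06 boundary (df assumed large, so the normal
# approximation is fine).
z = stats.norm.isf(0.06)           # statistic that gives one-tailed p = .06
p_one_tailed = stats.norm.sf(z)    # 0.06, assuming the sign was hypothesized in advance
p_two_tailed = 2 * p_one_tailed    # 0.12, the test you actually ran if the
                                   # directionality was decided after the fact
print(round(p_one_tailed, 3), round(p_two_tailed, 3))  # 0.06 0.12
```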

Now here’s the center of the rant: I really don’t believe you knew the directionality of your hypothesis before you ran the test and were willing to stick with it through thick and thin. I know that if the “sign was not as expected” and the result came out p=.003 two-tailed (p=.0015 one-tailed, in the opposite direction), you’d be figuratively jumping up and down with excitement and reporting a significant result, rather than lamenting that your original one-tailed test came out p=.9985. I dare say nobody has ever published an article claiming the latter (although I might give it a positive review just for kicks).
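The sign-flip arithmetic works the same way; a quick sketch (again Python/scipy, normal approximation, hypothetical numbers) of where the .0015 and .9985 come from:

```python
from scipy import stats

# Hypothetical sign-flip case: the two-tailed p is .003, but the coefficient's
# sign is opposite to the original directional hypothesis.
p_two_tailed = 0.003
z = stats.norm.isf(p_two_tailed / 2)        # |z| corresponding to that two-tailed p

p_observed_direction = stats.norm.sf(z)     # 0.0015: one-tailed, in the direction
                                            # the data happened to point
p_original_hypothesis = 1 - p_observed_direction  # 0.9985: the one-tailed test the
                                                  # author claimed to be committed to
print(round(p_observed_direction, 4), round(p_original_hypothesis, 4))  # 0.0015 0.9985
```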

And I really don’t want to have these discussions with sophomores and juniors, which is why I prefer textbooks that only cover two-tailed tests (i.e., “not Pollock,” a textbook I really like otherwise) so I don’t feel the need to rant.

See also: this FAQ from UCLA, which is a little more lenient—but not much.

Wednesday, 14 September 2005

Research methods exercise of the day

I had seven groups in class today do the following: come up with a way to test whether people’s blaming of the government for an inadequate response to Hurricane Katrina was affected by media coverage.

I think I had about ten answers. Which is as it should be, letting many flowers bloom and all that, and which goes to show that a seemingly simple question can be answered by social science in lots of different ways, sometimes with different answers. One strongly suspects the group that would have exposed different experimental groups to Shep and Anderson Cooper would have found somewhat different results than the group that measured self-reported media attentiveness in a sample survey.