Almost 30 years ago (1988), I published a paper in the Journal of Parapsychology titled “Successful Replication versus Statistical Significance,” in which I argued against the use of the standard “p ≤ .05” as the criterion for judging the success of an experiment. I pointed out problems with p-values that statisticians were well aware of even then but that many scientists (and journal editors) are only now beginning to understand, such as the role of sample size in determining statistical significance. The paper generated substantial discussion, and at the Parapsychological Association annual conference that year, someone distributed T-shirts to support my point of view.
For the past year, the American Statistical Association (ASA) has had a committee working to elucidate principles that should accompany the use of p-values. I asked Ron Wasserstein, the ASA’s executive director, to answer some questions about how this came about.