This piece was originally posted to the Personality Interest Group and Espresso (PIG-E) web blog at the University of Illinois.
Of late, psychological science has arguably done more to address the ongoing believability crisis than most other areas of science. Many notable efforts have been put forward to improve our methods. From the Open Science Framework (OSF), to changes in journal reporting practices, to new statistics, psychologists are doing more than researchers in any other science to rectify practices that allow far too many unbelievable findings to populate our journal pages.
The efforts in psychology to improve the believability of our science can be boiled down to some relatively simple changes. We need to replace or supplement the typical reporting practices and statistical approaches by:
- Providing more information with each paper so others can double-check our work, such as the study materials, hypotheses, data, and syntax (through the OSF or journal reporting practices).
- Designing our studies so they have adequate power or precision to evaluate the theories we are purporting to test (i.e., use larger sample sizes).
- Providing more information about effect sizes in each report, such as what the effect sizes are for each analysis and their respective confidence intervals.
- Valuing direct replication.
It seems pretty simple. Actually, the proposed changes are simple, even mundane.
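To give a sense of how mundane the second and third recommendations are, here is a minimal sketch, assuming a simple two-group comparison analyzed with a pooled-variance t-test. The function names and the example numbers are hypothetical, chosen only for illustration. The sample-size function uses a normal approximation, so it runs slightly below what an exact t-based power analysis would give, and the confidence interval is a common large-sample approximation rather than an exact one.

```python
import math
from statistics import NormalDist

def n_per_group(d, power=0.80, alpha=0.05):
    """Approximate subjects needed per group to detect a standardized
    difference d in a two-sided, two-group test (normal approximation)."""
    z_alpha = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    z_power = NormalDist().inv_cdf(power)
    return math.ceil(2.0 * ((z_alpha + z_power) / d) ** 2)

def cohens_d_from_t(t, n1, n2):
    """Convert an independent-samples (pooled-variance) t statistic to Cohen's d."""
    return t * math.sqrt(1.0 / n1 + 1.0 / n2)

def approx_ci_for_d(d, n1, n2, level=0.95):
    """Large-sample (normal-approximation) confidence interval for Cohen's d."""
    se = math.sqrt((n1 + n2) / (n1 * n2) + d ** 2 / (2.0 * (n1 + n2)))
    z = NormalDist().inv_cdf(0.5 + level / 2.0)
    return d - z * se, d + z * se

# Hypothetical numbers: a "medium" effect and a "small" effect.
n_medium = n_per_group(0.5)
n_small = n_per_group(0.2)
print(f"n per group for d = 0.5: {n_medium}; for d = 0.2: {n_small}")

# Hypothetical result: t = 2.10 with 30 subjects in each of two groups.
d = cohens_d_from_t(2.10, 30, 30)
low, high = approx_ci_for_d(d, 30, 30)
print(f"d = {d:.2f}, {95}% CI [{low:.2f}, {high:.2f}]")
```

In practice one would use a dedicated power-analysis tool and an exact interval based on the noncentral t distribution, but the arithmetic above is enough to show that neither step demands much beyond what a t-test already provides.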
What has been most surprising is the consistent pushback and protest against these seemingly innocuous recommendations. When confronted with these recommendations, many psychological researchers seem to balk. Despite calls for transparency, most researchers avoid platforms like the OSF. A striking number of individuals argue against, and are quite disdainful of, reporting effect sizes. Direct replications are disparaged. In response to the various recommendations outlined above, prototypical protests are:
- Effect sizes are unimportant because we are “testing theory” and effect sizes are only for “applied research.”
- Reporting effect sizes is nonsensical because our research is on constructs and ideas that have no natural metric, so that documenting effect sizes is meaningless.
- Having highly powered studies is cheating because it allows you to lay claim to effects that are so small as to be uninteresting.
- Direct replications are uninteresting and uninformative.
- Conceptual replications are to be preferred because we are testing theories, not confirming techniques.
While these protestations seem reasonable, the passion with which they are provided is disproportionate to the changes being recommended. After all, if you’ve run a t-test, it is little trouble to estimate an effect size too. Furthermore, running a direct replication is hardly a serious burden, especially when the typical study examines only 50 to 60-odd subjects in a 2×2 design. Writing entire treatises arguing against direct replication when direct replication is so easy to do falls into the category of “the lady doth protest too much, methinks.”
Maybe it is a reflection of my repressed Freudian predilections, but it is hard not to take a Depth Psychology stance on these protests. If smart people balk at seemingly benign changes, then there must be something psychologically big lurking behind those protests. What might that big thing be? I believe the reason for the passion behind the protests lies in the fact that, though mundane, the changes being recommended to improve the believability of psychological science undermine the incentive structure on which the field is built.
I think this confrontation needs to be more closely examined because we need to consider the challenges and consequences of deconstructing our incentive system and status structure. This, then, raises the question: what is our incentive system, and just what are we proposing to do to it? For this, I believe a good analogy is the dilemma faced by Harry Potter in the last book of the eponymous series.