In the issue dated 10 October, Science News reports on a study that suggests that peer reviewers prefer positive results:
Peer reviewers for biomedical journals preferentially rate manuscripts with positive health outcomes as better, a new study reports.
Now, at first blush this might seem like a “Duh!” moment, but it’s not. We obviously would like to see positive results when we’re studying a new medicine, but there’s a great deal of value in publishing negative results as well. It tells us what medicines don’t work. It tells researchers what direction to take, exposing some of the blind alleys. It’s critically important information, and that’s true in other fields, as well.
Consider studies of acupuncture, of astrology, and so on. There are a great many people who think those work — certainly enough to warrant a serious look with controlled studies. And controlled studies have been done. They show that astrology doesn’t work at all. They show that acupuncture works as a placebo: “fake” acupuncture is as effective as “real” acupuncture. These are useful results.
Consider herbal remedies: we know that herbs do have active substances in them, but there are lots of claims and we’d like to sort them out. Does Ginkgo biloba help with dementia? Does Echinacea reduce cold infections? Is Valerian effective as a sedative? Does St John’s wort work against depression? Studies say no, no, maybe, and yes, respectively. And the “no” results are arguably just as important as the others.
But it’s not just in medicine that we see a preference for favourable results. It’s true in my field of computer research, as well. In fact, while it would be quite important to see, say, methods of spam filtering that seemed like good ideas but fell flat, we rarely see people submitting them, and I’m quite certain that reviewers would lean toward rejecting them in favour of “more interesting” papers with “better” results.
Probably one significant reason for the lack of submissions is that people aren’t eager to document “failure”. That means that it’s incumbent upon the review and publication system to actively encourage the publication of good ideas that didn’t work out. The “good ideas” part of it is key, of course: there’s plenty of work that went nowhere, but that wasn’t promising to begin with. There’s limited value in publishing that stuff.
On the other side, reviewers should be looking at the value a study has for teaching or for directing future work, and for confirming or overturning common theories. A paper that shows definitively that something we expected to work doesn’t... is arguably more important than one with a partial result in the expected direction.









But one consideration relates to the logic of hypothesis-testing, which renders unsupported hypotheses uninterpretable (ie, you can't support the null). Technically, studies that failed to find, say, the benefits of acupuncture did just that; they didn't find that acupuncture is bunk per se. While the latter is one interpretation, giving heed to the alternative (ie, there is a relation, it just wasn't supported) is important as a means of honing research methods. It should push scientists to re-examine the situation and (for lack of a better word) take responsibility for constructing a better study.
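To make the "you can't support the null" point concrete, here is a rough simulation sketch. Everything in it is an assumption for illustration only: the effect size of 0.3 standard deviations, the group sizes, the known-variance z-test, and the function names `two_sample_z_pvalue` and `miss_rate` are all made up and describe no actual acupuncture study. It counts how often a study misses an effect that really is there.

```python
import random
from statistics import NormalDist, mean


def two_sample_z_pvalue(a, b, sigma=1.0):
    """Two-sided p-value for a difference in means, assuming a known sigma."""
    se = sigma * (1 / len(a) + 1 / len(b)) ** 0.5
    z = (mean(a) - mean(b)) / se
    return 2 * (1 - NormalDist().cdf(abs(z)))


def miss_rate(true_effect, n_per_group, trials=1000, alpha=0.05, seed=0):
    """Fraction of simulated studies that fail to reject the null
    even though a real effect of size `true_effect` exists."""
    rng = random.Random(seed)
    misses = 0
    for _ in range(trials):
        # Hypothetical data: the treated group really is shifted by true_effect.
        treated = [rng.gauss(true_effect, 1.0) for _ in range(n_per_group)]
        control = [rng.gauss(0.0, 1.0) for _ in range(n_per_group)]
        if two_sample_z_pvalue(treated, control) >= alpha:
            misses += 1
    return misses / trials


# Same real effect, two study sizes (both numbers are illustrative).
small = miss_rate(true_effect=0.3, n_per_group=20)
large = miss_rate(true_effect=0.3, n_per_group=400)
print(f"missed the effect with 20 per group:  {small:.0%} of studies")
print(f"missed the effect with 400 per group: {large:.0%} of studies")
```

Under these assumptions the small studies come up empty most of the time while the large ones rarely do, even though the underlying effect is identical. A non-significant result on its own says as much about the study design as about the thing being studied.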
In the case of acupuncture, while placebo is always an option, still we see people continuing to use it and to claim benefits from it; researchers who failed to support its benefits might hone their direction in any number of ways, be it their selection criteria (eg, focusing only on people with problems that conventional medicine couldn't solve; breaking results down by ethnicity), outcome measures (eg, making them broader or narrower), or manipulation.
Science is in large part based on observation and quantification; human error always being a given, there are still many everyday associations that science finds difficult to quantify (such as fatigue and MS). I'm with you in placing more of an emphasis on unsupported results, but I'd note that a failure to find a relation may be less a reflection of the construct than of the accepted methodology.