A new report released this week by the National Academies of Sciences, Engineering, and Medicine is weighing in on a contentious debate within the science world: the idea that scientific research is fundamentally flawed, rife with published findings that often can’t be reproduced or replicated by other scientists, otherwise known as the replication and reproducibility crisis.
Their report, the collaborative work of more than a dozen experts from universities and the private research world, doesn’t go so far as to call it a crisis. But it does call for wide-scale improvements in how scientists do their work, and it also takes scientists—and journalists—to task for sometimes overhyping the latest research findings.
For years now, some scientists have sounded a clarion call about the overall quality of published research. Common issues they highlight include fraudulent, poorly done, or overhyped studies with embellished findings based on small sample sizes; statistical manipulation of a study’s results during or after the experiment to achieve a desired outcome; and studies with negative conclusions being suppressed by their authors or rejected by scientific journals, which can skew the medical literature on a particular topic, such as a drug’s effectiveness.
The most glaring fallout from these issues has been that many of the most influential or flashy findings in science, particularly in psychology, fail to be reproduced (meaning that other researchers can’t get the same findings by crunching the same raw data obtained by the original study) or replicated (meaning that other researchers, when recreating the design of the original study, don’t get similar results).
In some surveys, a majority of scientists have agreed that science is facing a legitimate problem, and initiatives have sprouted up to start fact-checking widely accepted hallmark studies. But prominent researchers have accused these watchdogs of “methodological terrorism,” and aggrieved scientists whose work has been thrown into question have lashed back, accusing their critics of malicious motives. On the other end, researchers such as John Ioannidis (an early voice in this debate) have gone as far as to argue that most published findings are false.
It’s this battleground that the National Academies—one of the world’s leading and most trusted scientific organizations—is stepping into with its latest report. And it seems to strike a middle ground between the two fronts.
Even as it states that there are serious systemic gaps in how scientists conduct and relay their research, it doesn’t necessarily agree that there is a true “crisis” threatening science, even in the public’s eye.
“The advent of new scientific knowledge that displaces or reframes previous knowledge should not be interpreted as a weakness in science,” the report authors wrote. “Scientific knowledge is built on previous studies and tested theories, and the progression is often not linear. Science is engaged in a continuous process of refinement to uncover ever-closer approximations to the truth.”
At the same time, it offers a way forward for scientists, policy makers, and even the media, providing guidelines for better data transparency and rigor in original studies; criteria for when these studies might merit a reproduction or replication; and recommendations for how journalists should cover and report on these studies.
One glaring problem in science brought up by the report, for example, is that many studies don’t or can’t make available the full data needed for other researchers to reproduce their findings. Scientists are also misusing statistical tools like the p-value (a statistic compared against a threshold, usually 0.05, to convey whether a finding is statistically significant). A p-value below 0.05 means that, if the null hypothesis were true (i.e., if the effect the scientists predicted didn’t exist), results at least as extreme as those observed would be expected less than 5 percent of the time. In other words, the p-value is supposed to help tell us whether or not a study’s results are a fluke. But it doesn’t directly prove whether a drug does what it’s meant to do, for instance, nor does it tell us if a treatment is meaningfully, clinically effective in the real world.
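To make the idea concrete, here is a minimal sketch (not from the report) of an exact p-value calculation for a simple coin-flip experiment; the scenario and numbers are invented for illustration. Under the null hypothesis that the coin is fair, it asks how often a result at least as lopsided as the one observed would occur by chance:

```python
from math import comb

def binomial_p_value(n, k, prob=0.5):
    """Two-sided exact p-value for k successes in n trials, under the
    null hypothesis that each trial succeeds with probability `prob`.
    Assumes k is above the expected count n * prob."""
    # Probability of exactly j successes under the null
    def pmf(j):
        return comb(n, j) * prob**j * (1 - prob)**(n - j)
    # Chance of a result at least as extreme as k in the upper tail
    tail = sum(pmf(j) for j in range(k, n + 1))
    # Double the tail for a two-sided test, capped at 1
    return min(1.0, 2 * tail)

# 60 heads in 100 flips: suggestive, but not below the usual 0.05 cutoff
p = binomial_p_value(100, 60)
```

Even when such a p-value crosses the 0.05 threshold, it only speaks to chance under the null hypothesis, not to the size or practical importance of the effect, which is exactly the distinction the report says is often lost.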
That said, the report also notes that the American public’s confidence in science hasn’t wavered at all in recent years, despite major news articles discussing the “crisis” in psychology and elsewhere. And it found that even scientists who have criticized the current state of things aren’t completely on-board with calling science broken.
“How extensive is the lack of reproducibility in research results in science and engineering in general? The easy answer is that we don’t know,” Brian Nosek, co-founder and director of the Center for Open Science, told the report committee during a panel last year. “I don’t like the term ‘crisis’ because it implies a lot of things that we don’t know are true.”
The committee, while calling for a streamlined approach to replication and reproducibility studies as well as the better storage and availability of datasets for these studies to happen, also concludes that scientists shouldn’t worry too much about replicating any one individual study.
“A predominant focus on the replicability of individual studies is an inefficient way to assure the reliability of scientific knowledge,” they wrote. “Rather, reviews of cumulative evidence on a subject, to assess both the overall effect size and generalizability, is often a more useful way to gain confidence in the state of scientific knowledge.”
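Pooling evidence across studies, as the committee recommends, is often done with meta-analysis. Below is a minimal sketch of the common inverse-variance (fixed-effect) approach; the effect sizes and standard errors are invented, and this is an illustration of the general technique rather than anything specified in the report:

```python
# Hypothetical per-study effect sizes and their standard errors
effects = [0.30, 0.10, 0.25]
std_errors = [0.10, 0.15, 0.08]

# Weight each study by the inverse of its variance, so more precise
# (lower-error) studies count for more in the pooled estimate
weights = [1 / se**2 for se in std_errors]
pooled = sum(w * e for w, e in zip(weights, effects)) / sum(weights)

# Standard error of the pooled estimate shrinks as evidence accumulates
pooled_se = (1 / sum(weights)) ** 0.5
```

The point of the sketch is that the pooled estimate is both a compromise among the individual studies and more precise than any one of them, which is why reviews of cumulative evidence can inspire more confidence than a single replication.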
Of course, it isn’t just research methods that could stand to improve. The report also singles out journalists, citing a survey showing that 73 percent of Americans agree that the “biggest problem with news about scientific research findings is the way news reporters cover it.”
The report recommended that journalists “should report on scientific results with as much context and nuance as the medium allows,” especially when the research is complicated, contrary to what most similar studies have found on the same topic, or when the researchers involved have possible conflicts of interest, such as prior or current industry funding.
That presumably includes telling readers when a study involves mice instead of people.