The Problem with Science

A pharmacist holds a hydroxychloroquine pill at the Rock Canyon Pharmacy in Provo, Utah, May 27, 2020. (George Frey/Reuters)

It’s not the scientific method, but the scientists themselves.

In June, The Lancet retracted a major study based on a massive data set.

The study suggested that the drug hydroxychloroquine actively harms COVID-19 patients. But critics highlighted irregularities in the numbers; some of the study’s own authors were unable to audit the data to investigate the concerns; the company that claimed to have collected the information from thousands of hospitals did not appear to have the infrastructure and resources necessary to have done so; and several major hospitals in Australia whose participation would have been needed to make the numbers work told the Guardian they had not been involved. The New England Journal of Medicine had to pull a different study based on the same data.

The entire thing was a perfect stew of bad science, even aside from the core fact that the actual data were, er, “unreliable.” To begin with, these were “observational” studies, which can’t really tell whether a drug is causing a bad outcome; they were not randomized experiments. Thanks to the premature endorsement of hydroxychloroquine by President Trump, the straightforward and boring question of whether the drug worked had become political, so the media were all too happy to trumpet any negative finding. And the journals that ran these studies, despite having peer-review processes, failed to detect the problem.

It’s too bad that this happened too late for Stuart Ritchie to discuss it in his new book Science Fictions: How Fraud, Bias, Negligence, and Hype Undermine the Search for Truth, because it illustrates all of his main points. Science Fictions is a handy guide to what can go wrong in science, nicely blending eye-popping anecdotes with comprehensive studies. As the subtitle suggests, Ritchie is concerned with four issues in particular: fraud, bias, negligence, and hype. He explains how each of these gets in the way of the truth and makes a number of suggestions for fixing the process.

The chapter on fraud is easily the most harrowing, because it involves scientists who deliberately mislead their peers and the public. Here we meet Paolo Macchiarini, who claimed to be able to transplant tracheas, including artificial tracheas, by “seeding” them with some of the recipient’s stem cells so the recipient’s body wouldn’t later reject them. Macchiarini published several papers touting his successes. It later turned out that his patients were dying, but it took years for the scandal to come to light and for the institutions involved to admit their mistakes.

Frauds such as manipulated images and fake data can be easy enough to catch when a critic is looking for them, but most journals and peer reviewers tend to start from the assumption that scientists are at least well-intentioned. Heck, fraudsters often are well-intentioned in a perverse sense, trying to advance theories they genuinely believe to be true and important without going through the hassle of proving them. Yet in surveys, about 2 percent of scientists admit to faking data at least once, and a review of thousands of biology papers containing “western blots” (a technique to detect proteins) found that 4 percent included duplicated images.

If it’s easy for scientists to publish fraudulent results, it’s even easier for bias to creep into the process. We conservatives are always skeptical when left-leaning academics produce results that seem a bit too convenient, of course. But just as often, scientists might simply want a result that’s exciting and likely to bring them prestige, or one that undermines a rival.

This kind of bias is encouraged by the publication process. It’s hard to publish a study that arrives at a “null” result, meaning that the researcher tried to find an effect (say, that a certain food makes people energetic or that a certain style of argument changes people’s minds) but the result was statistically insignificant. “Turns out there’s nothing there, at least as far as I could measure” is an incredibly boring conclusion, and most journals aren’t interested in relaying it. When you spread that lack of interest across an entire scientific literature, it introduces bias: Effects look a lot stronger than they are when you ignore all the times they failed to show up.
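To get a sense of how large that distortion can be, here is a rough simulation of my own (not anything from Ritchie’s book): run thousands of small studies of a genuinely tiny effect, “publish” only the statistically significant ones, and the published average comes out several times larger than the truth.

```python
# A toy illustration of publication bias: simulate many underpowered studies of a
# weak true effect, "publish" only the significant ones, and compare the published
# average to the true effect. (My own sketch; the numbers are arbitrary.)
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
true_effect, n, n_studies = 0.1, 30, 5000   # tiny true effect, small samples

published = []
for _ in range(n_studies):
    treatment = rng.normal(loc=true_effect, size=n)
    control = rng.normal(loc=0.0, size=n)
    _, p = stats.ttest_ind(treatment, control)
    if p < 0.05:                            # only "exciting" results get published
        published.append(treatment.mean() - control.mean())

print(f"True effect: {true_effect}")
print(f"Average published effect: {np.mean(published):.2f}")
# The published average is several times larger than the true effect.
```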

Other times, scientists are driven to make null results disappear, rather than just giving up and moving to the next project. There are plenty of techniques for doing so. They can separate the data into a bunch of subgroups and check each one separately. They can try different statistical methods. They can use the data to answer questions they didn’t plan on answering. If they run enough different analyses, they’ll find something that hits that magical threshold of statistical significance — even if just by luck.
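How reliably does this pay off? A quick simulation (again, my own illustration, not an analysis from the book) makes the point: even when there is nothing to find, slicing pure noise into twenty subgroups and testing each one turns up a “significant” result most of the time.

```python
# A minimal sketch of p-hacking by subgroup: both groups are pure noise, yet testing
# enough subgroups almost always produces at least one p-value under 0.05.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
n_per_group, n_subgroups, n_experiments = 50, 20, 1000

lucky = 0
for _ in range(n_experiments):
    for _ in range(n_subgroups):
        a = rng.normal(size=n_per_group)    # "treated" subgroup, no real effect
        b = rng.normal(size=n_per_group)    # "control" subgroup, no real effect
        _, p = stats.ttest_ind(a, b)
        if p < 0.05:                        # the magical threshold
            lucky += 1
            break

print(f"Experiments yielding a 'significant' finding: {lucky / n_experiments:.0%}")
# Expect roughly 1 - 0.95**20, i.e. about 64 percent, despite zero true effects.
```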

Some scientists have even “hacked” their data this way without, apparently, realizing they’re doing anything wrong. In 2016, the prominent nutrition researcher Brian Wansink published a blog post in which he described a project that appeared to have “failed” until he told the graduate student analyzing the data to keep trying different approaches to find something to “salvage.” This set critics on Wansink’s trail, and it turned out to be part of a much broader pattern of sloppy work. An email came to light in which he explicitly instructed a subordinate to try “tweeking” [sic] an analysis to make an insignificant result turn significant. Many of his papers were retracted, and he resigned from Cornell.

The problems don’t stop at fraud and bias. There’s also plain negligence. Anyone who follows statistical research is familiar with countless instances in which “coding errors,” “spreadsheet errors,” “data-entry errors,” etc. screwed up a study. (Anyone who’s actually worked with spreadsheets and statistical software has made plenty of those errors, too, and hopefully caught them before it was too late.) Many of these mistakes are probably never discovered, but large reviews looking for specific problems, such as mathematical impossibilities, find that an astonishing number of scientific papers, maybe even half, contain at least one clearly incorrect figure.
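To give one concrete example of the kind of impossibility such reviews can catch (the check below is my own sketch, loosely modeled on the “GRIM” consistency test that critics have used, not something quoted from the book): if a paper reports the mean of whole-number responses, that mean must equal some integer total divided by the sample size, and a reported figure that can’t be produced that way is necessarily a mistake.

```python
def mean_is_possible(reported_mean, n, decimals=2):
    """GRIM-style check: can a mean reported to `decimals` places arise from
    n whole-number responses? (Illustrative sketch; the name is my own.)"""
    target = round(reported_mean, decimals)
    # The true total must be an integer near reported_mean * n; for modest sample
    # sizes, checking the nearest few integers is sufficient.
    approx_total = int(round(reported_mean * n))
    return any(
        round(total / n, decimals) == target
        for total in range(approx_total - 1, approx_total + 2)
    )

# A mean of 5.19 from 28 whole-number responses is arithmetically impossible:
print(mean_is_possible(5.19, 28))   # False
print(mean_is_possible(5.18, 28))   # True (145 / 28 rounds to 5.18)
```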

Finally, there is the issue of hype. It’s tempting to blame the media when a complicated study loaded with caveats is boiled down to a simplistic, clicky headline, but Ritchie doesn’t let the scientific establishment off that easy. He points out that in many cases, the hype begins with the press releases that accompany major studies, which the study authors themselves typically have a hand in writing. A 2014 study matched press releases with the papers they were based on, and found that the releases regularly encouraged readers to change their behavior in ways the study didn’t justify, buried the fact that research was conducted on animals instead of people, and conflated correlation with causation.

When you add all this together, you end up with vast swaths of scientific literature that can’t be trusted. Ritchie’s own field of psychology was roiled a few years back by the revelation that many oft-cited studies couldn’t be replicated when other researchers tried to rerun them. Similar phenomena have emerged in other fields as well — most disturbingly in medicine, where lives hang in the balance.

There are lots of ways to address these problems — and science’s flaws have garnered increasing attention in recent years, so there’s a chance the establishment will do something about them. Ritchie’s suggestions are too numerous and detailed for me to do them justice, but here are a few: Scientists should more frequently “pre-register” their work, meaning they explain what they plan to do ahead of time so they can’t change it later to get better results. Journals should be more willing to publish null results and attempts to replicate previous studies, and might even commit to publishing studies before the results are known. Technology should make it easier to bring the output of statistical-analysis software into the body of a paper, reducing the mundane copy-and-paste errors so many studies seem to suffer from. Funders should insist that all findings, including the null ones, be published. And so on.

The scientific method is sound. But scientists are human beings with human flaws, and it’s only by controlling those flaws that we can make science better.

Editor’s note: This piece has been emended since its original posting, to remove a claim that Ritchie had erred in his discussion of Roland Fryer’s 2016 police-shooting study.
