The Corner

What’s ‘Significant’ about the Oregon Medicaid Experiment?

Anyone interested in the debates about the future of American health-care policy has paid acute attention to the Oregon Medicaid experiment, whose second round of results was released at the beginning of this month. The study arose from a natural experiment in which Oregon (my home state) had a few more resources available than the state government anticipated and decided to have a lottery for additional entry into Medicaid. Two years later, we can have some idea of just what Medicaid does.

This latest round of results from Oregon has some interesting implications. Health-care wonks on the right point to the finding that Medicaid has not improved physical health under common standards of statistical significance. If many of its promised benefits are just statistical noise, Medicaid is clearly a poor use of hundreds of billions of dollars. This is especially relevant because the bulk of Obamacare is just an expansion of Medicaid. Health-care wonks on the left have pointed to the statistically significant improvements in mental health and financial stability, while also reminding us that a lack of statistical significance does not mean we can conclude Medicaid does nothing positive. They point out that those on Medicaid do better on a variety of health metrics such as cholesterol and blood pressure, but not by margins large enough to rule out simple “statistical noise” as the source of the difference.

Enter Jim Manzi, a peerlessly persuasive proponent of using experiments in policymaking. In a careful post for The Daily Beast, Manzi shows that, if we suspend statistical conventions, it is equally plausible that Medicaid actually made its enrollees worse off. Those not on Medicaid had higher risk scores for heart disease, which might have something to do with the fact that those on Medicaid were more likely to be smokers, which might in turn have something to do with the way things that make us feel safer (Medicaid) can lead us to engage in riskier behavior (smoking). Liberal policy thinkers tend not to have Manzi’s keen eye for unintended consequences like these — and severe unintended consequences could very well be part and parcel of Medicaid.

In response, Kevin Drum of Mother Jones has written a post that quibbles with Manzi’s word choice:

Many of the results of the Oregon study failed to meet the 95 percent standard [of confidence that the observed effect isn’t noise], and I think it’s wrong to describe this [as the study’s authors do] as showing that “Medicaid coverage generated no significant improvements in measured physical health outcomes in the first 2 years.”

To be clear: it’s fine for the authors of the study to describe it that way. They’re writing for fellow professionals in an academic journal. But when you’re writing for a lay audience, it’s seriously misleading. Most lay readers will interpret “significant” in its ordinary English sense, not as a term of art used by statisticians, and therefore conclude that the study positively demonstrated that there were no results large enough to care about.

But that’s not what the study showed. A better way of putting it is that the study “drew no conclusions about the impact of Medicaid on measured physical health outcomes in the first 2 years.” That’s it. No conclusions.

Drum’s general preference for eschewing unnecessarily arcane and technical wording in policy writing is admirable. But Drum’s post is mostly beside the point: Manzi is careful with his language and leaves us with the correct impression that, notwithstanding the Oregon data, we still know very little.

Drum’s post is also pretty deceptive: It distorts Manzi’s meaning when attempting to translate his argument into non-technical language. Drum believes that the clearest statement of the report’s findings would be to say that it “drew no conclusions about the impact of Medicaid.” His complaint is that, because the average reader understands “significance” as indicating an effect of great magnitude rather than an effect that has a low probability of being due to statistical noise, a finding with “no significance” will be incorrectly understood as an effect of little or no magnitude.

People might think we are accepting the null hypothesis (Medicaid does nothing) when in fact we cannot settle on any of the possible options: Medicaid helps X amount, Medicaid hurts X amount, or Medicaid neither helps nor hurts. That’s true as far as it goes. But the problem with that logic is that we’re using frequentist statistics, so we can never accept the null. We can only reject the null, meaning we can only ever become statistically confident that Medicaid helps or that it hurts. In saying that what we’ve learned isn’t “Medicaid did nothing important” but rather “we can draw no conclusions about the impact of Medicaid,” Drum leads his readers astray. It’s not that we just don’t have enough information to conclude that Medicaid did nothing important; it’s that we could never draw that conclusion. It’s methodologically impossible. So what happens if a program costing hundreds of billions of dollars actually does nothing? When will we have enough information to plausibly decide to reform it? The answer, according to Drum’s logic, is never. This sort of statistical argument blindly favors the status quo. If reformers argue on these terms, they will never have enough information to actually reform.
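For readers who want to see the asymmetry concretely, here is a minimal sketch of the conventional 95 percent test applied to a single estimate. The function name and the numbers are hypothetical, chosen only for illustration; they are not taken from the Oregon study, and the normal approximation is an assumption.

```python
def classify(estimate, std_error, z=1.96):
    """Classify an estimated effect under the usual 95 percent convention.

    Hypothetical helper for illustration only. It assumes a normal
    approximation, so the interval is estimate +/- z * std_error.
    """
    lower = estimate - z * std_error
    upper = estimate + z * std_error
    if lower > 0:
        return "reject null: effect is positive"
    if upper < 0:
        return "reject null: effect is negative"
    # The interval straddles zero: help, harm, and no effect all remain possible.
    return "fail to reject null"


# Made-up numbers, NOT from the Oregon study.
print(classify(estimate=-0.5, std_error=0.7))
```

Notice that none of the three outcomes is “accept the null.” Under this procedure the strongest statement available in favor of “no effect” is a failure to reject it, which is the sense in which the argument above says the null can never be affirmed.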

Reformers on both sides of the aisle need to live in this world, and that means dealing with unconquerable uncertainties. In other words, you sometimes need to act as if you can affirm the null, rather than merely reject it. We might not be there yet — I would certainly like there to be more experiments like Oregon’s — but at some point we have to actually make policy given our limited information.

— Jeremy Rozansky is an assistant editor at National Affairs.

