The New York Times has a story on a rule the Environmental Protection Agency is working on, complete with a leaked draft. The gist is that, under the rule, the EPA would not rely on scientific studies to support regulations unless the “data and models” underlying the findings were released to the public.
There is an enormous upside to this. When scientists are completely transparent about their methods and openly provide their data, other researchers and skeptics can point out errors and otherwise refine the analysis. That’s a boon to science — and a check on researchers who massage their analyses until they get the output they want. Under the regulation, we’d never again hear the words, “Our data prove that we need massive new environmental regulations, but no, you can’t see them.”
As the Times emphasizes, unsurprisingly, there are also big downsides. The studies the EPA relies on often involve personal health information collected with a guarantee of confidentiality, or proprietary business data. The rule would apply retroactively, so it could eliminate studies the agency has been relying on for decades, including when longstanding regulations come up for renewal. (Apparently the biggie is the “Six Cities Study,” which showed higher death rates from lung cancer and cardiopulmonary disease in places with more air pollution, especially “fine particulates.” The data were not made public, though there was a reanalysis by a research group jointly funded by the EPA and the car industry.)
Further, it can be difficult and expensive to truly anonymize data so that they’re suitable for public release. It’s not just a matter of stripping off the names; you also have to make it hard for nosy neighbors to identify people based on other details. And there’s no consensus on how anonymized is anonymized enough: The Census Bureau has whipped up a controversy over a proposal to anonymize data far more aggressively than it has in the past, which would make the information a lot less useful to researchers.
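To make the nosy-neighbor problem concrete, here is a minimal sketch in Python. It uses one common yardstick, k-anonymity: the size of the smallest group of records that share the same combination of identifying details. The records, field choices, and function names below are all made up for illustration; nothing here comes from the EPA draft or from any real study.

```python
from collections import Counter

# Hypothetical records with names already stripped:
# (zip code, birth year, sex, diagnosis)
RECORDS = [
    ("02138", 1961, "F", "asthma"),
    ("02139", 1964, "F", "COPD"),
    ("02138", 1967, "F", "asthma"),
    ("02141", 1983, "M", "none"),
    ("02142", 1985, "M", "asthma"),
    ("02141", 1988, "M", "none"),
]

def smallest_group(records, key):
    """Return the size of the smallest group of records sharing the same key."""
    counts = Counter(key(r) for r in records)
    return min(counts.values())

def exact(record):
    """Identifying details as collected: full zip code, birth year, sex."""
    zip_code, birth_year, sex, _ = record
    return (zip_code, birth_year, sex)

def coarsened(record):
    """Same details, blurred: 3-digit zip prefix, birth decade, sex."""
    zip_code, birth_year, sex, _ = record
    return (zip_code[:3], birth_year // 10 * 10, sex)

k_exact = smallest_group(RECORDS, exact)
k_coarse = smallest_group(RECORDS, coarsened)

print(k_exact)   # 1: at least one person is unique even without a name
print(k_coarse)  # 3: each person now blends in with two others
```

Even with the names gone, every person in this toy data set is unique on zip code, birth year, and sex, so a neighbor who knows those three things can find the diagnosis. Blurring the details fixes that, but it also throws away exactly the kind of fine-grained geography and age information researchers tend to want.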
I’m not convinced the leaked rule is the right course, but the document also spells out two highly promising alternatives. In one, the EPA wouldn’t necessarily ignore studies without public data, but could give them less weight. A variation on this alternative would apply only to data and models created after the rule went into effect, so researchers would be on notice that if they wanted the EPA to give their work a lot of weight, they’d need to design their studies with transparency in mind. This seems entirely fair.
In another, the EPA would develop a “tiered” system in which access to data could be restricted to various degrees depending on the privacy concerns at issue. This could include limiting access to qualified researchers who work with the data at “secure data enclaves.” That could create a lot of expense and headaches, but given that even your tax records have been made available to researchers, it’s certainly workable.