Why are social sciences less scientific than natural sciences? And what does this imply about public policy? To the first question, many people probably would answer, “Because social sciences involve human beings, and human beings sometimes do things that are not predictable.” But that answer is at best shallow, and at worst entirely wrong. Moreover, the fact that human beings are not perfectly predictable has never stopped economists, sociologists, or political scientists from trying to contribute useful knowledge.
Jim Manzi’s book attempts to provide an answer that is both more rigorous and more helpful. Manzi, an entrepreneur and a contributing editor to NR, ends up making a case that social scientists would be better served by (cautiously) undertaking more experiments. Undertaking rigorous experiments is also Manzi’s recommendation to policymakers.
The ideas in this book are important, and I think it belongs on the syllabus of graduate programs and high-level undergraduate programs in social science and public policy. It is unfortunate that Manzi probably does not have enough academic street cred to gain that sort of audience. For instance, even though he skewers famous studies by renowned Princeton and Vanderbilt political-science professor Larry Bartels and renowned University of Chicago economist Steven Levitt, their position in the professional hierarchy probably makes them impregnable, particularly when attacked by someone from outside the academy.
Manzi introduces a new and useful term to describe the problem of the social sciences: causal density. Causal density means that there are many factors that can affect the phenomena in which social scientists are interested. Think of all of the plausible causes of World War I, the Great Depression, or the recent financial crisis. Causal density can be just as serious an issue when dealing with ongoing social concerns: How can we sort out the causes of, for example, income inequality or differences in educational outcomes?
The problem of causal density also crops up in the natural sciences, notably biology. Even though there is strong evidence of heritability of diseases and other characteristics, the hopes of pinning these traits down to specific genes or sets of genes have faded. There is too much causal density.
For me, the paradigmatic case of causal density is macroeconomics, as typified by the question of how effective fiscal stimulus is in ameliorating a recession. We want to know whether, all other things being equal, more government spending raises output and employment. History, however, does not hold other things equal.
When experiments are not practical, we rely on observational data. Manzi points out that this worked in the case of establishing a link between smoking and lung cancer. In that context, the circumstances under which observational studies can demonstrate causality were spelled out by epidemiologist Austin Bradford Hill. Among them are strength of relationship, consistency of relationship, dosage-response relationship, plausibility, and coherence with other scientific findings.
The challenge in judging the effect of government deficits on economic performance is that the data that are available do not satisfy the Hill criteria. For example, one does not observe a consistently positive relationship between deficit spending and economic outcomes; in fact, one observes quite the contrary, that large deficits are associated with weaker economic performance. Turning to Hill’s other criteria, a positive relationship between deficit spending and economic outcomes is plausible and coherent for Keynesians, but not for economists who subscribe to classical theory. This debate has persisted ad nauseam.
Manzi argues that where controlled experiments are feasible (i.e., not in macroeconomics), they can provide a better, albeit imperfect, solution to the problem of causal density. For example, if one is testing a new pedagogical technique, one can randomly assign some students to be taught the old way and others to be taught using the new method. Many of the most trustworthy findings in social science have come from such experiments. There is a famous Rand study, now nearly three decades old, of health-insurance policies with different deductibles. Also famous are the various experiments testing Milton Friedman’s idea of a negative income tax as a tool to alleviate poverty.
I was once seated at a dinner table next to an official of the Department of Education involved in education research. I made an impassioned plea for more controlled experiments in education. The official responded by asking, “Would you want your child to be the subject of an experiment?” At this, my jaw dropped, and I sputtered, “They do it to my children all the time! They constantly introduce curriculum changes, scheduling changes, and changes in teacher methods. They just don’t bother to evaluate whether or not it works.”
Statistical-quality-control guru W. Edwards Deming used the term “tampering” to describe this process of introducing changes without rigorously evaluating results. Tampering and experiments are two ways of disturbing the status quo. But only experiments are designed with the intent of producing reliable measurements of success or failure.
Like my dinner companion, most policymakers view experiments as at best costly and at worst immoral. Even though tampering is just as bad, if not worse, it somehow escapes such criticisms.
Manzi points out that most social experiments are too small and too limited in their initial conditions. Much is made of the Perry Preschool Experiment, conducted in one location with fewer than 150 students. Manzi argues that the best practice is to conduct multiple experiments in a variety of initial conditions. He concludes that in fields with high causal density, experimental methods are a significant tool for producing reliable results, and that a single experiment is much less reliable than multiple, replicated experiments. Most new programs and policies fail to achieve their desired results, and it would be better to discover this beforehand, using experiments. (Of course, to the extent that policymakers do not want to recognize failures, they will not want to conduct experiments.)
Looking at experimental results, Manzi notes a general finding that “programs that attempt to improve human behavior by raising skills or consciousness are even more likely to fail than those that change incentives and environment.” It is really hard to fix flaws in human character.
Manzi argues that the value of experiments bolsters the case for federalism, because states can be laboratories for what works in social policy. But I am not sure that his case for this is sound. In theory, if Washington were to approach social policy by conducting rigorous, controlled experiments in order to determine what works, that might be better, on Manzi’s own terms, than leaving the 50 states alone to engage in unsystematic tampering. I think that the case for federalism is actually more subtle: Attempting to change the skills or consciousness of officials in order to influence them to conduct rigorous experiments as part of the policy process is unlikely to work. But creating an environment in which incentives lead them to adopt experimental methods has a better chance of success. A more competitive political system, of the kind a decentralized structure might provide, could create this sort of environment.
This is a provocative book for people who are interested in how social science relates to public policy. I am confident that most of the people who read it will benefit from it. I am much less confident that most of the people who would benefit from it will read it. That reflects my pessimistic view of today’s intellectual culture, particularly in the academy.
– Mr. Kling is an economist and the author, most recently, of Unchecked and Unbalanced: How the Discrepancy Between Knowledge and Power Caused the Financial Crisis and Threatens Democracy. He writes for EconLog at econlog.econlib.org.