When the Head Start Impact Study generally failed to show cognitive or behavioral improvements that lasted beyond kindergarten, Head Start’s defenders pointed to possible “sleeper effects” as a reason to keep the program going. The argument is that Head Start may have imparted a benefit that is not detectable in the elementary years but that emerges later on. A new paper from Brookings’ “Hamilton Project” follows in that tradition, claiming that Head Start improves high school graduation rates, college attendance, self-control, self-esteem, and parenting practices.
Unlike the Impact Study, these new findings do not come from a randomized controlled experiment. Instead, the Hamilton authors try to isolate the effect of Head Start by comparing children who attended Head Start with their siblings who did not. The underlying assumption is that the sibling pairs have the same average characteristics except for Head Start attendance. The authors acknowledge that this assumption does not necessarily hold. After all, parents may tend to enroll the children they believe are best suited for the program, confounding the Head Start children's natural advantages over their siblings with the effects of the program itself.
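To see how selective enrollment can contaminate a sibling comparison, consider a toy simulation (my own sketch, not from either study): suppose the program has no true effect at all, but parents enroll whichever sibling they judge to have higher latent ability. The sibling comparison then attributes the ability gap to the program.

```python
import random

random.seed(0)

# Toy model: the program's true effect is zero, but parents enroll
# the sibling with higher latent ability (a stand-in for "best
# suited for the program"). All numbers here are hypothetical.
TRUE_EFFECT = 0.0
n_families = 100_000

diffs = []
for _ in range(n_families):
    ability = [random.gauss(0, 1), random.gauss(0, 1)]
    # Parents pick the child they believe is better suited.
    enrolled = 0 if ability[0] > ability[1] else 1
    score = [a + (TRUE_EFFECT if i == enrolled else 0.0)
             for i, a in enumerate(ability)]
    diffs.append(score[enrolled] - score[1 - enrolled])

estimate = sum(diffs) / len(diffs)
# Prints a clearly positive "effect" even though TRUE_EFFECT = 0.
print(f"sibling-comparison estimate: {estimate:.2f}")
```

The simulated sibling comparison reports a sizable benefit that is entirely an artifact of which child the parents chose to enroll.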
So, for purposes of evaluating Head Start, can a sibling comparison really generate valid treatment and control groups as an experiment would? In my opinion, that claim is hard to justify now that we have first-grade and third-grade follow-up data from the Impact Study to compare with the non-experimental evidence. I’m simplifying somewhat, but the basic story is this: The Impact Study showed Head Start children with higher test scores at the end of the Head Start year, but the advantage was gone by the end of kindergarten. By contrast, Hamilton’s sibling-versus-sibling methodology shows the Head Start children with test score gains that persisted to age 10 (and age 14 for boys).*
If the Hamilton study were really isolating the effect of Head Start the way the Impact Study does, we should not see Hamilton's Head Start children doing so much better in the early grades. That divergence from the experimental evidence suggests that Head Start children are systematically different from their non-Head Start siblings, and that the long-term Head Start benefits reported by Hamilton may be an artifact of that difference.
Granted, testing for sleeper effects is not easy. Even if we had the time and money to conduct a large-scale, multi-site experiment over decades, the results could be out of date as soon as they come in. For example, one study favored by Head Start supporters concluded that the poorest children born in the 1960s and 1970s had lower childhood mortality rates because of Head Start. Given the major advances in healthcare access and quality in the last several decades, how relevant is that finding to 2016?
Since long-term studies may not be conclusive, researchers should focus more on measures that capture the short-term changes that supposedly produce long-term gains. (Maybe enhancing a child’s “grit”?) If sleeper effects are real, Head Start must be doing something to children in the here and now. Measure it.
* The Hamilton study borrows its methodology from a 2009 paper by my former grad-school colleague David Deming. The test score data I’m citing come from his paper. The Hamilton authors do not report any test scores.