Higher Education, Data Transparency, and the Limits of Data Anonymization

The editors of Bloomberg View tout various state-level initiatives to contain the rising cost of higher education. Most of these ideas are perfectly reasonable, and likely to do at least some good. But I am increasingly convinced that unless governments do a better job of measuring student learning and labor market outcomes, any reform efforts will be of limited use. Advocates of higher education reform often argue that we ought to reward the most effective institutions, i.e., the institutions that do the most to improve student outcomes per dollar spent. The problem, however, is that we don’t have very good tools for assessing outcomes. Andrew P. Kelly and Daniel K. Lautzenheiser of the American Enterprise Institute offer two ideas for how states might address this data vacuum:

States should require institutions to measure student learning outcomes in a rigorous, reliable, and comparable way. This is not to suggest that states should coerce all institutions to use the same standardized test or that these tests should be a requirement for graduation. Rather, institutions should have the opportunity to choose from a menu of assessments, with the results made public. Administering an exam twice during a student’s tenure can allow institutions to measure the value added by the institution as a whole, providing less-selective institutions with an opportunity to showcase the gains their students make while in attendance. At the two-year level, policymakers should also report the proportion of remedial students who went on to successfully complete credit-bearing courses.

Kelly and Lautzenheiser also call for providing students, parents, and taxpayers with more useful and reliable information on labor market outcomes:

Similarly, states should take steps to link data on postsecondary experience with earnings and employment information. Some institutions already try to measure employment and earnings using graduate surveys, but these are expensive to conduct and often suffer from low response rates. Linking administrative data from postsecondary and wage records is likely to be more informative and less expensive in the long run (despite start-up costs). With these data systems in hand, states would ideally be able to connect average earnings to both institutions and degree programs. Done right, this would enable prospective students to say, “If I am an accounting major at Eastern State University, the average wage one year after graduation is $50,000,” and then compare that to accounting programs at other institutions.

Students who are unsure about what type of program to pursue or what to major in may find this useful as well, as it will provide them with a sense of what credentials are likely to lead to a good job. These data can be particularly helpful to combat the myth that a bachelor’s degree is the only path to the middle class. Evidence suggests that graduates from some two-year and certificate programs outearn those with bachelor’s degrees, at least in the near term.

Kelly and Lautzenheiser have many other promising ideas for state governments in their report, “Taking Charge: A State-Level Agenda for Higher Education Reform.” One of them, a call for “Charter Universities,” might appeal both to conservatives who hope to encourage business model innovation and to liberals critical of for-profit higher education, a sector that largely exists to meet the needs of nontraditional students ill-served by higher education incumbents. My concern is that while at least some states are inching toward the kind of learning and labor market assessments Kelly and Lautzenheiser champion, state-level efforts are of limited use compared with a federal student unit record system, which would leverage the wage data collected by the Social Security Administration to give a clearer picture of how graduates of various higher education programs fare. Back in May, Kelly observed the following:

The federal government is uniquely positioned to collect data that could help students make better choices. The feds have already invested $500 million in state longitudinal data systems that could provide more comprehensive student success measures. And the Social Security Administration already collects wage data on all workers; a simple match could link labor market information to post-secondary experience.

Unfortunately, in 2008 Congress went in the opposite direction, explicitly banning the federal government from collecting individual-level data on college students. Some higher education interests argued that the ban on a student-unit record system was critical to protect student privacy. It coincidentally helps to protect colleges and universities from the wrath of better-informed consumers. This is not an argument to plug new data into ham-handed accountability measures, but to empower consumers to vote with their tuition dollars.

The fundamental challenge is that linking individual-level labor market information to post-secondary experience while protecting privacy requires data anonymization, yet there are real questions about whether true data anonymization is even possible. Pete Warden addressed this question in 2011:

Precisely because there are now so many different public datasets to cross-reference, any set of records with a non-trivial amount of information on someone’s actions has a good chance of matching identifiable public records. Arvind [Narayanan] first demonstrated this when he and his fellow researcher took the “anonymous” dataset released as part of the first Netflix prize, and demonstrated how he could correlate the movie rentals listed with public IMDB reviews. That let them identify some named individuals, and then gave access to their complete rental histories. More recently, he and his collaborators used the same approach to win a Kaggle contest by matching the topography of the anonymized and a publicly crawled version of the social connections on Flickr. They were able to take two partial social graphs, and like piecing together a jigsaw puzzle, figure out fragments that matched and represented the same users in both.
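To make the cross-referencing Warden describes concrete, here is a toy sketch of a linkage match. It is not the researchers' actual method, which is considerably more sophisticated. It only shows the basic idea that a handful of overlapping records can tie a pseudonymous history to a named public profile. The function name `best_match`, the `min_overlap` threshold, and all of the data are hypothetical.

```python
from typing import Dict, Optional, Set, Tuple

# Hypothetical, simplified record: (title, date of rating).
Rating = Tuple[str, str]


def best_match(anon_history: Set[Rating],
               public_profiles: Dict[str, Set[Rating]],
               min_overlap: int = 3) -> Optional[str]:
    """Return the public name whose known ratings overlap most with an
    'anonymous' history, or None if no single candidate stands out."""
    scored = sorted(
        ((len(anon_history & ratings), name)
         for name, ratings in public_profiles.items()),
        reverse=True,
    )
    # Too little overlap: not enough evidence to link anyone.
    if not scored or scored[0][0] < min_overlap:
        return None
    # A tie for first place means the match is ambiguous, so decline to guess.
    if len(scored) > 1 and scored[1][0] == scored[0][0]:
        return None
    return scored[0][1]


# Made-up "anonymized" release: the user ID is a pseudonym, the history is real.
anonymous = {
    "user_8841": {("Brazil", "2005-03-01"), ("Ran", "2005-03-04"),
                  ("Heat", "2005-04-11"), ("Solaris", "2005-05-02")},
}

# Made-up public reviews posted under real names.
public = {
    "A. Example": {("Brazil", "2005-03-01"), ("Ran", "2005-03-04"),
                   ("Heat", "2005-04-11")},
    "B. Sample": {("Heat", "2005-04-11"), ("Alien", "2005-06-30")},
}

matches = {uid: best_match(history, public)
           for uid, history in anonymous.items()}
```

Once a pseudonym is linked to a name, every other record in that history, including the ones the person never posted publicly, is exposed as well. The same logic would apply to a student record that combines institution, program, graduation year, and earnings.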

Warden recommends a variety of strategies that might address this problem, e.g., limiting the detail of the information provided, but this would also limit the usefulness of the underlying data. My own view is that we ought to take Warden’s second recommendation seriously, which is to acknowledge and accept the risk of de-anonymization in light of the benefits that greater data transparency would provide, while doing what we realistically can to limit the risk.
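One common way of "limiting the detail" is to publish only group averages and to suppress any group too small to hide an individual. The sketch below is an illustration under stated assumptions, not a description of any actual state or federal system. The `average_earnings` function, the `min_cell` threshold of 10, and the sample records are all hypothetical. Only the Eastern State accounting figure echoes the example quoted earlier.

```python
from collections import defaultdict
from statistics import mean
from typing import Dict, Iterable, Optional, Tuple

# Hypothetical linked record: (institution, program, earnings one year out).
Record = Tuple[str, str, float]


def average_earnings(records: Iterable[Record],
                     min_cell: int = 10
                     ) -> Dict[Tuple[str, str], Optional[float]]:
    """Average earnings per (institution, program), with small cells
    suppressed (reported as None) to reduce re-identification risk."""
    cells = defaultdict(list)
    for institution, program, earnings in records:
        cells[(institution, program)].append(earnings)
    return {
        key: (mean(values) if len(values) >= min_cell else None)
        for key, values in cells.items()
    }


# Made-up data: a large program and a very small one.
records = (
    [("Eastern State", "Accounting", 50_000.0)] * 12
    + [("Eastern State", "Classics", 41_000.0)] * 3
)

report = average_earnings(records)
```

Here the accounting average is published, while the three-student classics program is withheld. That is the tradeoff in miniature: the rule that protects those three graduates also leaves prospective classics majors with no information at all.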

Reihan Salam is executive editor of National Review and a National Review Institute policy fellow.
