NR Digital

Color by Numbers

by Fred Schwarz
For employers who want to test job applicants, it’s damned if you do and damned if you don’t

Among the few Americans who have heard of Chester Alan Arthur, his main accomplishment as president is generally considered to be civil-service reform. The Pendleton Act (1883), adopted at Arthur’s urging, provided for most federal jobs to be filled through competitive tests instead of being handed out by political bosses. The principle was eventually adopted by most state and local governments as well.

Today, however, the feds are suing local governments because they use civil-service tests. The Department of Justice has challenged a New Jersey state-police sergeants’ exam because whites passed at a somewhat higher rate (89 percent) than blacks and Hispanics (73 and 77 percent, respectively). In New York City, a written exam for prospective firefighters appears likely to be thrown out for similar reasons. The city of Chicago, in despair of finding a test that can withstand this sort of scrutiny and still be useful, may simply abolish its entry exam for police trainees.

All this is the unintended result of the Civil Rights Act of 1964. Title VII of that statute banned racial discrimination in hiring by public and private employers and created the Equal Employment Opportunity Commission (EEOC) to enforce it. When the law was passed, its sponsors insisted that there would be no need for quotas or numerical targets, just an end to discrimination. A Senate memorandum explained that “an employer may set his qualifications as high as he likes, he may test to determine which applicants have these qualifications, and he may hire, assign, and promote on the basis of test performance.”

The reality has been something different. Consider the case of Ford Motor Company. In 1991 it adopted a written test, measuring basic arithmetical and cognitive skills, for entrants into its apprentice program for electricians, machinists, and pipefitters. The test’s results showed a sharp racial disparity: In one case, 4 percent of blacks passed, compared with 37 percent of whites. Yet that test was completely legal — at first.

Before going live, the Ford exam had been broken down, refined, adjusted, verified, and validated by a small army of psychologists and statisticians. As required by law, the relevance of every question to the jobs that apprentices would be trained for had been demonstrated beyond any hope of a challenge. But before the decade was over, and without any evidence that it was unfair, that same test suddenly became invalid — making Ford subject to financial penalties and other sanctions.

The chink in Ford’s armor was that simply getting a test legally validated in this way isn’t enough. Under the EEOC’s Uniform Guidelines on Employee Selection Procedures (UGESP), any test an employer uses must have the least adverse impact on protected groups out of all available alternatives. So as the science of testing advances, an employer must continually update its job analysis, survey the possible alternatives, and show that the test it’s using is still the least adverse choice.

The workers who sued Ford didn’t have to show that there was anything wrong with the test Ford was using, just that the company hadn’t proven that it remained the least-bad option. A complaint filed with the EEOC in 1998 led to a finding of probable cause; a class-action lawsuit was filed, along with an action by the EEOC; and in 2005, after what one lawyer called “intense, adversarial negotiations,” Ford agreed to a settlement under which it paid $10 million to class members and their attorneys and agreed to hire 280 black apprentices.

By the way, that settlement and most of the litigation leading up to it happened under the Bush administration. Barack Obama thinks they were a bunch of wimps. In his State of the Union address he crowed: “My administration has a Civil Rights Division that is once again prosecuting civil-rights violations and employment discrimination.” Eric Holder’s Justice Department has made “disparate impact” enforcement a top priority, with large budget increases to match.

Holder also seems intent on expanding the meaning of “disparate impact.” Under the UGESP, a black-white disparity in pass rates of 20 percent or less (measured relatively, so that the lower-scoring group passes at no less than four-fifths the rate of the highest-scoring group) will generally put an employer in the clear, unless the government thinks the employer has discouraged black applicants in the past, or has not kept sufficient records to prove that it hasn’t. (The employer can purchase an indulgence of this sin by instituting an affirmative-action program.) The New Jersey exam, which is also used by many local police departments in the state, fell within this 20 percent limit, yet the Justice Department is going after it anyway.
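For readers who want to check the arithmetic, here is a rough sketch of the four-fifths comparison, using only the pass rates quoted in this article. The function names and the 0.8 default are illustrative assumptions, not anything taken from the guidelines' text, and the sketch ignores the record-keeping and past-conduct exceptions described above.

```python
def impact_ratios(pass_rates):
    """Divide each group's pass rate by the highest group's pass rate."""
    top = max(pass_rates.values())
    if top <= 0:
        raise ValueError("at least one group must have a positive pass rate")
    return {group: rate / top for group, rate in pass_rates.items()}


def within_four_fifths(pass_rates, threshold=0.8):
    """True if every group's ratio to the top group meets the threshold.

    The 0.8 default is the four-fifths rule of thumb (an assumption about
    how the rule is applied; the real analysis has further conditions).
    """
    return all(r >= threshold for r in impact_ratios(pass_rates).values())


# Pass rates as quoted in the article.
new_jersey = {"white": 0.89, "black": 0.73, "hispanic": 0.77}
ford = {"white": 0.37, "black": 0.04}

if __name__ == "__main__":
    for name, rates in (("New Jersey", new_jersey), ("Ford", ford)):
        ratios = {g: round(r, 2) for g, r in impact_ratios(rates).items()}
        print(name, ratios, "within four-fifths:", within_four_fifths(rates))
```

On these figures the New Jersey ratios come to roughly 0.82 and 0.87, inside the limit, while Ford's ratio of about 0.11 falls far outside it.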

So the options facing any company with 15 or more employees that wants to avoid a lawsuit are unappealing. Fred W. Alvarez, a Reagan-era EEOC commissioner who now practices employment law, calls the situation a “trilemma.” At one extreme, you can shell out large sums of money, often in the hundreds of thousands of dollars or more, to get a test validated (though even a solidly validated test can be challenged, and the obloquy associated with being publicly accused of racism is enough to make some employers shy away from testing). At the other extreme, you can rely on subjective measures, such as interviews, and pray that nobody files a complaint. Somewhere in between, you can find or develop a test that seems useful, do whatever you can afford to make it fit the job you’re hiring for, and then cross your fingers and hope for the best.

Want to reduce hassles by using a professionally prepared, pre-approved exam from a testing company? No dice. Any test must have its relevance demonstrated for the specific job you’re hiring for, and it must be proven non-discriminatory not just in general, but for the specific pool of applicants that the EEOC deems relevant to your industry and area. To be fair, though, they’re not fanatics about this: Any racial, ethnic, or gender group that forms less than 2 percent of the applicant pool can be ignored for validation purposes.

If you want to hire, say, 50 people, and you decide to take the 50 highest scorers on your test, you’re asking for trouble. When you give applicants an exam, the passing score must be set at the lowest level that’s consistent with the criteria listed in the job analysis (which, in turn, must include only “critical or important” duties). Everyone who scores above this minimum level must be treated equally. “The more, the better” reasoning (for instance, the more weight an applicant can lift, the better firefighter he will be) is not acceptable.

If you test employees by having them perform actual job duties, you must show that the people evaluating them, and the procedures the evaluators use, are free from “the possibility of bias,” and that “the manner and setting of the selection procedure and its level and complexity . . . closely approximate the work situation.” If, on the other hand, you give applicants a written test or a set of made-up tasks, everything will be “closely reviewed for job relevance.” How closely? The “technical standards” section of UGESP contains hundreds of instructions for conducting a validity study — verify this, prove that, examine the methods used, document every fact and figure, and keep meticulous, voluminous records of it all — though only about 90 of them are marked “essential.”

In all these cases, remember, the government does not need to prove that a test is unfair to Group X; all it needs to show is that Group X does worse on the test, and that the employer has not proven its validity. There’s no such concept as good faith, no credit for a reasonable belief that a test is job-related, and no weighing of costs against benefits. Moreover, the guidelines let the American Psychological Association decide which methods are “accepted by the psychological profession,” and thus must be used in drawing up, validating, administering, and scoring an employment test. Trained psychologists will be required for all these steps. If you ask the members of the APA how much of their services you need, what do you think they will say?

Yet for all their emphasis on cutting-edge psychological research, the Uniform Guidelines were adopted in 1978, before most companies even had computers, and they have not been updated since. The reason is that some recent statistical advances (like combining numerous small samples into a single large one with greater reliability), and some experience-based modifications (like allowing greater use of standardized exams), would make testing easier. That’s exactly what the civil rights–academic complex doesn’t want, because it would encourage the use of testing, which civil-rights activists hate, and reduce the need for bespoke services from highly trained (and paid) experts. So technical advances that increase the requirements on employers are adopted under the rubric of “professional standards”; advances that would decrease the requirements are excluded by sticking to language frozen in the disco era.

In theory, employers could avoid all these problems by using a test on which whites and blacks (along with women, Hispanics, and other protected groups) score equally well. When that happens, the test is automatically acceptable, and nobody cares if it has anything to do with the job requirements. Unfortunately, as the respected industrial psychologist James L. Outtz has pointed out, “the average score for minority applicants is almost always lower than that for non-minority applicants, regardless of the test,” and this remains true not just with pencil-and-paper tests but “whether it’s a work sample, whether it’s an in-basket, whether it’s video-based.” So the burden of proof lies with the employer in virtually every case.

Complicating matters still further is the Supreme Court’s recent decision in Ricci v. DeStefano, the New Haven firefighters case, where it ruled that an employer may not ignore a test it has administered simply because one group outscored another by enough to invite a lawsuit. Basically, the courts are enforcing one version of equal opportunity, the Department of Justice is enforcing another, and employers are in the middle getting fired on by both sides. No wonder that Chicago and many other employers are tempted to abandon a century of psychological research and forgo testing entirely — which is exactly what many minority-group advocates hope for.

After the 1964 Civil Rights Act was passed, there were two ways the civil-rights movement could have gone. It could have used the law’s sanctions as a weapon against intentional hiring bias, which was still widespread at the time, while in the longer term working to bring minorities’ performance up to par; instead, it did its best to brand almost all tests as racist. If employers commonly require a high-school diploma or administer written exams, you can either help people stay in school and learn to read, or else sue to strike down the requirements. Over the years, the civil-rights establishment has leaned ever more heavily on the latter course. The result is suspicion and resentment on all sides, a new racial spoils system, and a huge financial and regulatory burden on employers — but plenty of work for lawyers, statisticians, and industrial psychologists.
