In a prolix article at Quartz.com, writer Olivia Goldhill delineates comprehensively how a single test administered to the freshman class at Yale University in 1998 has been used to justify claims of “implicit bias” commonly trumpeted by the Left, and how the original test, much to the discomfiture of those seeking to wield the epithet to target others, has serious flaws.

The test, named the Implicit Association Test (IAT), was ostensibly designed to reveal and measure unconscious racism, writes Goldhill. It has since been administered to millions of people worldwide and used to justify references to implicit bias in perpetuating the myth of the gender pay gap or racist police shootings.

Goldhill notes that of various versions of the test (there are more than a dozen), the test receiving the most attention has been the black-white race IAT, which assumes that accurate reflections of implicit bias are gained from identifying words as “good” or “bad” as quickly as possible. She notes, “The slower you are and the more mistakes you make when asked to categorize African-American faces and good words using the same key, the higher your level of anti-black implicit bias — according to the test.”

After a lengthy explanation of how various companies have tried to address supposed implicit biases, Goldhill points out that those efforts have been largely ineffectual.

Patrick Forscher, psychology professor at the University of Arkansas, commented, “I was pretty shocked that the meta-analysis found so little evidence of a change in behavior that corresponded with a change in implicit bias.” He added, “I currently believe that many (but not all) psychologists, in their desire to help solve social problems, have been way too overconfident in their interpretation of the evidence that they gather. I count myself in that number. The impulse is understandable, but in the end it can do some harm by contributing to wasteful, and maybe even harmful policy.”

Goldhill notes that many of the people she knows refuse to criticize the idea of implicit bias among friends and colleagues. She writes, “One friend said her colleagues wouldn’t discuss diversity at all were it not for the implicit-bias workshops. So, as she so bluntly asked: Why was I stirring up the shit?”

Then Goldhill gets down to the crux of the matter:

In recent years, a series of studies have led to significant concerns about the IAT’s reliability and validity. These findings, raising basic scientific questions about what the test actually does, can explain why trainings based on the IAT have failed to change discriminatory behavior.

First, reliability: In psychology, a test has strong “test-retest reliability” when a user can retake it and get a roughly similar score. Perfect reliability is scored as a 1, and defined as when a group of people repeatedly take the same test and their scores are always ranked in the exact same order. It’s a tough ask. A psychological test is considered strong if it has a test-retest reliability of at least 0.7, and preferably over 0.8.

Current studies have found the race IAT to have a test-retest reliability score of 0.44, while the IAT overall is around 0.5 (pdf); even the high end of that range is considered “unacceptable” in psychology. It means users get wildly different scores whenever they retake the test …

The second major concern is the IAT’s “validity,” a measure of how effective a test is at gauging what it aims to test. Validity is firmly established by showing that test results can predict related behaviors, and the creators of the IAT have long insisted their test can predict discriminatory behavior. This point is absolutely crucial: after all, if a test claiming to expose unconscious prejudice does not correlate with evidence of prejudice, there’s little reason to take it seriously.
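To make the reliability figures quoted above concrete, here is a minimal Python sketch of how a test-retest score could be computed. Because the passage defines perfect reliability in terms of rank order, the sketch uses a rank (Spearman) correlation; the source does not say which statistic the cited studies actually used, so that choice is an assumption. The function names and the sample scores are hypothetical and purely illustrative.

```python
from statistics import mean


def ranks(values):
    """Return 1-based ranks for values, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j across any run of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = avg_rank
        i = j + 1
    return result


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def retest_reliability(first, second):
    """Rank correlation between two sittings of the same test (assumed metric)."""
    return pearson(ranks(first), ranks(second))


# Hypothetical scores for four people who each took a test twice.
first_sitting = [0.1, 0.4, 0.7, 0.9]
same_order = [0.2, 0.5, 0.6, 1.0]   # everyone keeps the same rank
reshuffled = [0.9, 0.1, 0.7, 0.4]   # the ranking changes on the retake

stable = retest_reliability(first_sitting, same_order)
unstable = retest_reliability(first_sitting, reshuffled)
```

In this toy data, the first pair keeps everyone in the same order and so scores a perfect 1, while the reshuffled retake scores well below the 0.7 threshold the passage describes.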

Goldhill notes:

In a 2014 paper (pdf) by Banaji, Greenwald, and Nosek, the authors seemed to acknowledge the concerns raised about the test: “IAT measures have two properties that render them problematic to use to classify persons as likely to engage in discrimination,” they wrote, pointing to the test’s poor predictive abilities and test-retest reliability.

Yet when reality interferes with your desired results, disavow reality as quickly as you can, apparently:

But when I asked them directly, Greenwald and Banaji doubled down on their earlier claims. “The IAT can be used to select people who would be less likely than others to engage in discriminatory behavior,” wrote Greenwald in an email.