The Situationist

Tierney’s Skepticism at the New York Times

Posted by Jerry Kang on November 19, 2008

Recently, John Tierney, who writes a science column for the New York Times, has shown great skepticism about the concept of implicit bias, how it might be measured (through the Implicit Association Test), and whether it predicts real-world behavior.  See, e.g., Findings column (Nov. 17, 2008).  I write to provide praise, critique, and cultural commentary.

First, praise.  I praise Tierney’s skepticism, which is fundamental to critical inquiry generally and good science especially.  Serious, critical inquiry is why most of us got into academics, and it’s why you the reader are reading this blog.

Second, critique.  But skepticism should not be one-sided.  Tierney’s columns suggest that one side is just asking for good, skeptical science, whereas the other side is recklessly pushing a politically correct agenda.  That is hardly fair and balanced.  To take one example, Tierney gives prominent weight to Prof. Phil Tetlock’s criticisms of the implicit bias research.  But let’s probe further.  In an article by Tetlock and Prof. Gregory Mitchell (UVA Law) attacking the science, the authors suggest that one of the reasons Whites may perform worse on the Black-White IAT is a phenomenon called “stereotype threat”.  They write that Whites “react to the identity threat posed by the IAT by choking under stress–and performing even worse on the IAT, thus confirming the researchers’ original stereotype of them.” 67 Ohio St. L.J. 1023, 1079 (2006).

For this “choke under threat” explanation, Tetlock and Mitchell cite a single study.  Moreover, they do not turn their powerful skepticism against this body of work, launched by Prof. Claude Steele at Stanford, which explains why negative stereotypes can depress test performance.  This body of work, if taken as seriously as Tetlock and Mitchell do in a throw-away line, challenges the use of standardized examinations in university admissions.  But I doubt that that’s what Tetlock and Mitchell would call for, as a matter of policy.  So why not be methodologically pure and go after the stereotype threat work with equal vigor and skepticism?  Instead, they deploy “stereotype threat” science without raising an eyebrow, since it fits their arsenal of critique of the “implicit bias” science.

The general point is that it’s facile to think that one side has the scientific purists — just seeking good data and good science, and the other side has the political hacks.  And self-serving reasoning no doubt infects us all, on both sides.  This is why we should trust long-run scientific equilibrium and be skeptical of both aggressive claims and their backlashes.

Third, cultural commentary.  The readers’ comments to the Tierney articles are fascinating because they largely give no deference to scientific expertise.  From the large N of 1, those who have taken the IAT conclude that the test must be nonsense and raise myriad confounds (without bothering to read the FAQs that explain how stimuli are randomized, etc.).  If geneticists were debating the meaning of some expressed sequence tags or if astrophysicists were debating new evidence of dark matter, I wonder if readers would bother to chime in aggressively with their views.  “I have plenty of genes, and that view about inheritability is nonsense!”  “I’ve seen stars, and if I can’t see ’em they must not exist!”

I suggest that we feel so personally connected to race and to gender (most of the comments focus on race) and are so personally invested in not being biased that we feel compelled toward such participation.  Again, if some “coffee increases likelihood of ulcers” study came out, would people write in:  “I drink coffee, and I don’t have an ulcer!!!”  I don’t think so.  What does that say about our current cultural moment?  Perhaps it reveals a sort of intellectual prejudice: a proclivity to treat race research as nothing more than personal opinion, regardless of its scientific and statistical bona fides.

* * *

Look, science always involves conflict.  And in the long run, there’s no reason to think that this controversy won’t be resolved through the traditional scientific method and reach an equilibrium consensus.  But getting there has already been rocky and will continue to be.  Maybe the implicit bias work, which is far more extensive than just the Implicit Association Test (IAT), will turn out to be nothing more than “intelligent design”–just ideology (in that case religious) wrapped up in pseudo-science.  Or, and I think this is far more likely, it will be another inconvenient truth that is established, as global warming ultimately was:  We are not as colorblind as we hope to be, and on the margins, implicit associations in our brain alter our behavior in ways that we would rather they not.  Certainly the balance of peer-reviewed studies, in number and quality, points in that direction.

In the end, time truly will tell.  The real question is which side will maintain its scientific integrity when the results come in.

Full disclosure:  I’m a co-author with Mahzarin Banaji, whose work is discussed in Tierney’s pieces.  You can read my implicit bias work at:

One Response to “Tierney’s Skepticism at the New York Times”

  1. Consequentialist said

    “Again, if some “coffee increases likelihood of ulcers” study came out, would people write in: “I drink coffee, and I don’t have an ulcer!!!” I don’t think so.”

    Actually, they make comments like that all the time across a whole range of science reporting. I’ve seen researchers in a number of fields making your complaint, without knowledge of the plight of their counterparts, assuming the grass is greener. Even physicists have to deal with folk physics.

    “We are not as colorblind as we hope to be, and on the margins, implicit associations in our brain alter our behavior in ways that we would rather they not.”

    I agree. But the dichotomy between this statement and ‘pseudoscience’ leaves the questions of ‘which margins’ and ‘by how much’ wide open. I want to know if effects have quantitative significance, as opposed to statistical significance, and to get an accurate estimate of that quantitative significance.

    “So why not be methodologically pure and go after the stereotype threat work with equal vigor and skepticism?

    Instead, they deploy “stereotype threat” science without raising an eyebrow, since it fits their arsenal of critique of the “implicit bias” science.”

    If we’re going to talk about stereotype threat, we should probably also talk about it as a precedent for the extreme overhyping of research in subtle effects of racism/stigmatization/stereotyping, and part of a general pattern of distortion.

    The initial stereotype threat study was grossly misreported by the media for years as showing that test score gaps ‘disappeared’ in conditions designed to reduce stereotype threat. In fact, the study only showed that students from different groups with the same SAT scores performed the same on a test in the low-threat condition, and the effects of stereotype threat across the board tend to be quantitatively small, e.g. the ETS experiments. When media and policymakers take a possible explanation for a very small portion of an effect as an explanation for the entire effect, they will miss the real causes and enact mistaken policies.

    When Scarr conducted her famous transracial adoption study, which showed early increases in IQ for black and biracial children adopted into white middle class families (although gaps remained, with biracial children performing intermediately between black and white), the early results were loudly trumpeted. The later results, which showed that the effects faded away in adulthood (like most environmental interventions, as the heritability of IQ increases with age) were held back from publication for years (Scarr considered suppressing the data permanently), and in the final paper the damning data were tucked away and a misleading discussion added.

    Psychometricians under intense pressure and the Department of Labor falsely (and pretty transparently so) claimed that ‘race-norming’ test scores improved predictive validity, and millions of people and many employers were given race-normed scores without explanation before the practice was banned by Congress.

    In addition to hyped claims supporting racism-based explanations, there are all sorts of contrary facts that are consensus science in the relevant fields. For instance, the APA consensus statement on intelligence notes large phenotypic group differences in IQ, that IQ predicts job performance increasingly well with job complexity, and that IQ scores are predictively valid across groups, while remaining agnostic on the cause of those differences. Combined with affirmative action, this ensures that on average black employees will have worse job performance than employees from other groups. This finding is clear in both subjective and objective measures of job performance (typically clearer in the latter, actually) with absolutely massive datasets, e.g. carefully measured performance in the U.S. military.

    Yet when Nobelist James Watson said this in an informal way, this ‘inconvenient truth’ was viewed as a mark of intolerable racism. His comments about genetics playing a causative role did not enjoy the same overwhelming support, but scientists and professional societies that claimed there was no evidence were outright lying, as anonymous surveys indicate that a majority of intelligence researchers believe that genetics plays a role (the surveys also reveal that they consider it irresponsible for people like Arthur Jensen or James Watson to tell the public about it).

    Laypeople of privileged groups are predisposed to reject ‘subtle racism’ claims, whereas journalists and academics are predisposed to uncritically endorse them (or to keep silent otherwise), whether because most members of those professions are high-Openness to Experience social liberals in unusually cosmopolitan environments or because of the threat of professional punishment. But it’s the latter group that determines media coverage of the area, and if someone like Tetlock or Tierney wants to offset bias in the coverage and the science, enumerating the contrary evidence available, including strong and weak pieces of evidence, seems pretty reasonable. It’s not clear how strong Tetlock thinks the stereotype threat argument is, but even if he thinks it’s weak, that doesn’t mean he shouldn’t mention it: it’s still evidence, and other people take it quite seriously. There’s also a jujitsu element: the standard bias in journalism and academia favors both implicit association and stereotype threat explanations, so the people most likely to overhype the former are also most likely to overemphasize the latter, so illustrating the tension is likely to help counteract that bias.
