Thursday, June 08, 2017

Terrorism and the Implicit Association Test

Induced Stereotyping?

Imagine that you're riding on a very crowded bus in a busy urban area in the US. You get on during a shift change, when a new driver takes over for the old one. The new driver appears to be Middle Eastern, and for a second you have a fleeting reaction that the situation might become dangerous. This is embarrassing and ridiculous, you think. You hate that the thought even crossed your mind. There are 1.8 billion Muslims in the world. How many are radical Islamist extremists? For example, in the UK at present, the number comprises maybe 0.00000167% of all Muslims? 1

Language matters.

Theresa May:
“First, while the recent attacks are not connected by common networks, they are connected in one important sense. They are bound together by the single, evil ideology of Islamist extremism that preaches hatred, sows division, and promotes sectarianism. 

It is an ideology that claims our Western values of freedom, democracy and human rights are incompatible with the religion of Islam. It is an ideology that is a perversion of Islam and a perversion of the truth.”

Donald Trump:
“That means honestly confronting the crisis of Islamist extremism and the Islamist terror groups it inspires. And it means standing together against the murder of innocent Muslims, the oppression of women, the persecution of Jews, and the slaughter of Christians.
. . .”


In both of these cases, the world leaders did acknowledge that Islamist extremism is not the same as the religion of Islam. Nonetheless, in terms of statistical co-occurrence in the English language, the root word Islam- is now associated with all that is bad and evil in the world. Could the constant exposure to news about radical Islamist terrorism and Trump's proposed Muslim Ban result in an involuntary or “forced” stereotyping in the bus scenario above?

A recent study found that semantics derived automatically from language corpora contain human-like biases, which means that machines (which do not have cultural stereotypes) become “biased” when they learn word association patterns from large bodies of text, such as Google News. The authors used a word embedding algorithm called Global Vectors for Word Representation (GloVe), which represents each word as a vector in a semantic space. As a measure of human bias, they used the popular implicit association test (IAT), from which they developed the Word-Embedding Association Test (WEAT). Instead of response times (RTs) to a specific set of words, WEAT uses the distance between sets of word vectors in that semantic space. The authors were able to replicate the associations seen in every IAT they tested (Caliskan et al., 2017), suggesting:
“The number, variety, and substantive importance of our results raise the possibility that all implicit human biases are reflected in the statistical properties of language.”
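To make the WEAT idea concrete, here is a minimal sketch of the effect size as I understand it from Caliskan et al. (2017): each target word gets an association score (its mean cosine similarity to one attribute set minus its mean cosine similarity to the other), and the effect size is the difference between the two target sets' mean scores, divided by the standard deviation of the scores across all target words. The 2-D vectors below are made up for illustration (real GloVe vectors have hundreds of dimensions), the function names are my own, and the use of the sample standard deviation (ddof=1) is an assumption.

```python
import numpy as np

def cosine(u, v):
    """Cosine similarity between two vectors."""
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

def association(w, A, B):
    """How much closer word vector w is to attribute set A than to attribute set B."""
    return np.mean([cosine(w, a) for a in A]) - np.mean([cosine(w, b) for b in B])

def weat_effect_size(X, Y, A, B):
    """Difference in mean association between target sets X and Y,
    normalized by the spread of associations over all target words."""
    s_X = [association(x, A, B) for x in X]
    s_Y = [association(y, A, B) for y in Y]
    # Assumption: sample standard deviation (ddof=1)
    return (np.mean(s_X) - np.mean(s_Y)) / np.std(s_X + s_Y, ddof=1)

# Hypothetical toy vectors, NOT real embeddings
X = [np.array([1.0, 0.1]), np.array([0.9, 0.2])]    # target set 1
Y = [np.array([0.1, 1.0]), np.array([0.2, 0.9])]    # target set 2
A = [np.array([1.0, 0.0]), np.array([0.95, 0.05])]  # attribute set 1 (e.g., pleasant words)
B = [np.array([0.0, 1.0]), np.array([0.05, 0.95])]  # attribute set 2 (e.g., unpleasant words)

effect = weat_effect_size(X, Y, A, B)
print(effect)  # positive here: X sits closer to A, Y sits closer to B
```

A positive value plays the same role as an IAT effect in the "expected" direction; swapping the two target sets flips the sign.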

Arab-Muslim Implicit Association Test

Because of the relationship between word associations and implicit bias, I decided to take the Arab-Muslim IAT at Project Implicit, an organization interested in “implicit social cognition — thoughts and feelings outside of conscious awareness and control.” This definition seemed to fit with the bus scenario, which involved an impulse to profile the driver based on a rapid evaluation of perceived ethnicity.

In the Arab-Muslim IAT, the participant classifies words as good (e.g., Fantastic, Fabulous) or bad (e.g., Horrible, Hurtful), and proper names as Arab Muslim (e.g., Akbar, Hakim) or “Other People” (e.g., Ernesto, Philippe, Kazuki).2 The bias is revealed when you have to sort both of these categories at the same time. Are you slower when Good/Arab Muslim are mapped to the same key than when Bad/Arab Muslim are mapped to the same key (or vice versa)?
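The comparison described above boils down to a difference in response times between the two key mappings. Here is a simplified sketch of that comparison: the difference in mean RT between the two blocks, divided by the standard deviation of all trials pooled together. This is only the core of the idea. The scoring actually used for the IAT involves additional steps (such as trimming very slow trials and penalizing errors) that are left out here, and the RTs below are invented for illustration, not my data.

```python
from statistics import mean, stdev

def iat_d_score(rt_block_a, rt_block_b):
    """Simplified IAT-style score: mean RT difference (block b minus block a)
    divided by the standard deviation of all trials from both blocks.
    Positive values mean responses were slower in block b."""
    pooled_sd = stdev(rt_block_a + rt_block_b)
    return (mean(rt_block_b) - mean(rt_block_a)) / pooled_sd

# Hypothetical response times in milliseconds
good_with_arab_muslim = [650, 700, 620, 710, 680]  # Good + Arab Muslim share a key
bad_with_arab_muslim = [780, 820, 760, 850, 790]   # Bad + Arab Muslim share a key

d = iat_d_score(good_with_arab_muslim, bad_with_arab_muslim)
print(d)  # positive: slower when Bad/Arab Muslim share a key
```

With these made-up numbers the score is positive, which would be read as an automatic preference for Arab Muslims over Other People; reversing the two blocks flips the sign.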

My results are below.


I showed a moderate automatic preference for Arab Muslims over Other People. But this wasn't completely unique compared to the population of 327,000 other participants who have taken this test:

“The summary of other people's results shows that most people have little to no implicit preference for Arab Muslims compared to Other People - i.e., they are just as fast when sorting good words and Arab Muslims than sorting good words and Other People.”

The aggregate results above covered a period of 11.5 years ending in December 2015. The strength of semantic associations between words can vary over time and contexts, so it's worth asking whether this has shifted at all since then. In addition, different results have been observed when faces were used instead of names, and when a better list of “Other People” names was used to specify ingroup vs. outgroup (see explanation in footnote #2).

A Muslim-Terrorism test has in fact been developed by Webb et al. (2012). They used a variant of the IAT (the GNAT) with Muslim names (e.g., Abdul, Ali, Farid, Khalid, Tariq), Scottish names (e.g., Alistair, Angus, Douglas, Gordon, Hamish), terrorism-related words (e.g., attack, bomb, blast, explosives, threat) and peace-related words (e.g., friendship, harmony, love, serenity, unity). In an interesting twist, the authors varied “implementation intentions” to flip the Muslim-Terrorism test to the Muslim-Peace test in half of the subjects:
“Following the practice trials, one-half of the participants (implementation intention condition) were asked to form an implementation intention to help them to respond especially quickly to Muslim names and peace-related words. Participants were asked to tell themselves ‘If Muslim names and peace are at the top of the screen, then I respond especially fast to Muslim words and peace words!’. Participants were asked to repeat this statement several times before continuing with the experiment. The other half of the participants (standard instruction condition) were given no further instructions.”

I actually discovered this strategy on my own in 2008, when my IAT results revealed I was Human AND Alien and NEITHER Dead NOR Alive.

And indeed, the Muslim-Peace instructions neutralized the strong Muslim-Terrorism association seen in the control participants (Webb et al., 2012).

Calvin Lai and colleagues conducted a high-powered series of experiments showing that instructions such as implementation intentions and faking the IAT can shift implicit racial biases (Lai et al., 2014), but the effects of these interventions are short-lived (Lai et al., 2016).

I wrote about the former study in 2014: Contest to Reduce Implicit Racial Bias Shows Empathy and Perspective-Taking Don't Work. Failed interventions all tried to challenge the racially biased attitudes and prejudice presumably measured by the IAT. These interventions are below the red line in the figure below.


Figure 1 (modified from Lai et al., 2014). Effectiveness of interventions on implicit racial preferences, organized from most effective to least effective. Cohen’s d = reduction in implicit preferences relative to control; White circles = the meta-analytic mean effect size; Black circles = individual study effect sizes; Lines = 95% confidence intervals around meta-analytic mean effect sizes. IAT = Implicit Association Test; GNAT = go/no-go association task.

The major message here is that top-down cognitive control processes can affect thoughts and feelings that are purportedly outside of conscious awareness — and can apparently override semantic associations that are statistical properties of language obtained from a large-scale crawl of the Internet (containing 840 billion words)!

Now whether the IAT actually measures implicit bias is another story...

ADDENDUM (June 11 2017): Prof. Joanna J. Bryson, a co-author on the machine learning/semantic bias paper, wrote a very informative blog post about this work: We Didn't Prove Prejudice Is True (A Role for Consciousness).


1 I cannot imagine what it's like to be a survivor of the recent Manchester and London attacks, and my deepest condolences go out to the families who have lost loved ones.

2 Notice I put “Other People” in quotes. That's because the names are not all from the same category (country/ethnicity): Latino, French, and Japanese in the examples above. This lack of uniformity could slow down RTs for the “Other People” category. A better alternate category would have been all French names, for instance. Or use common European-American names to differentiate ingroup (Michael, Christopher, Tyler) vs. outgroup (Sharif, Yousef, Wahib).


Caliskan A, Bryson JJ, Narayanan A. (2017). Semantics derived automatically from language corpora contain human-like biases. Science 356(6334):183-186.

Lai CK, Marini M, Lehr SA, Cerruti C, Shin JE, Joy-Gaba JA, Ho AK, Teachman BA, Wojcik SP, Koleva SP, Frazier RS, Heiphetz L, Chen EE, Turner RN, Haidt J, Kesebir S, Hawkins CB, Schaefer HS, Rubichi S, Sartori G, Dial CM, Sriram N, Banaji MR, Nosek BA. (2014). Reducing implicit racial preferences: I. A comparative investigation of 17 interventions. J Exp Psychol Gen. 143(4):1765-85.

Lai CK, Skinner AL, Cooley E, Murrar S, Brauer M, Devos T, Calanchini J, Xiao YJ, Pedram C, Marshburn CK, Simon S, Blanchar JC, Joy-Gaba JA, Conway J, Redford L, Klein RA, Roussos G, Schellhaas FM, Burns M, Hu X, McLean MC, Axt JR, Asgari S, Schmidt K, Rubinstein R, Marini M, Rubichi S, Shin JE, Nosek BA. (2016). Reducing implicit racial preferences: II. Intervention effectiveness across time. J Exp Psychol Gen. 145(8):1001-16.

Webb TL, Sheeran P, Pepper J. (2012). Gaining control over responses to implicit attitude tests: Implementation intentions engender fast responses on attitude-incongruent trials. Br J Soc Psychol. 51(1):13-32.
