NORMS OF EMOTIONAL VALENCE, AROUSAL, THREAT VALUE AND SHOCK VALUE FOR 80 SPOKEN FRENCH WORDS: COMPARISON BETWEEN NEUTRAL AND EMOTIONAL TONES OF VOICE

This paper presents a controlled database of 80 neutral, negative, positive and taboo spoken French words rated by 166 participants on scales for emotional valence, arousal, threat value and shock value. Ratings were provided for each word spoken in a neutral and in an emotionally congruent tone of voice. The data point to the importance of taking into account various emotional dimensions of a stimulus: although strongly correlated, these emotional dimensions cannot be conflated and their impact on emotional evaluation varies according to the emotional category of the word. This also holds true for the influence of the tone of voice in which the words are uttered.


Introduction
Many studies are concerned with the influence of the emotional content of stimuli on cognitive processes. Among them, several investigate the potential impact of the emotional valence of stimuli on attentional processes. For that purpose, various experimental paradigms were developed or adapted from existing ones. The dot probe task (e.g., MacLeod, Mathews, & Tata, 1986), the emotional Stroop task (e.g., Pratto & John, 1991) and the emotional variant of the attentional cuing paradigm (e.g., Fox, Russo, Bowles, & Dutton, 2001) are most frequently used. In these studies, pictures, faces, and also words are presented as emotional stimuli.
The use of verbal material in these studies has been criticised. Given their symbolic nature, written words may not threaten the participant as directly as do pictures (e.g., Mogg & Bradley, 1998). This may be one of the reasons why the influence of the emotional content of words on attentional processes is usually not observed in healthy people. However, the significant effects of the emotional content of pictures which have been reported in unselected participants (e.g., Waters, Lipp, & Spence, 2004) are also difficult to replicate (see Bar-Haim, Lamy, Pergamin, Bakermans-Kranenburg, & van IJzendoorn, 2007, for a meta-analysis).
We assumed that, over and above the symbolic or non-symbolic nature of the material, its modality of presentation could be crucial. As a matter of fact, in various situations of daily life, a source of danger is often not visible. Consistent with this, audition is usually considered to be the main alerting or early-warning system (Scharf, 1998), influencing not only auditory but also visual attention (Spence & Driver, 1997). Attentional orienting towards emotional sounds would thus be crucial. Hence, in experimental studies, the impact of the auditory presentation of emotional stimuli on attentional orienting ought to be explored. In particular, spoken words may be ecologically more relevant than written words, both because of their high frequency of use and because oral language is more ancient than written language in the history of the species and in ontogenetic development.
These assumptions have been supported by a recent study using auditory adaptations of the dot probe task (Bertels, Kolinsky, & Morais, under revision). We demonstrated that the presentation of emotional spoken words influenced attentional orienting in unselected participants, in conditions similar to those under which this influence was not observed with written words (e.g., MacLeod et al., 1986). The use of auditorily presented verbal material thus seems to open interesting avenues of research.
It would be useful for researchers in this domain to have at their disposal a controlled database of spoken words. Validated sets of stimuli for research on emotions are already available both in the visual linguistic (for French: Bonin, Méot, Aubert, Malardier, Niedenthal, & Capelle-Toczek, 2003; Messina, Morais, & Cantraine, 1989, and for English, see e.g., the ANEW, Bradley & Lang, 1999b) and non-linguistic domains (for faces: Ekman & Friesen, 1976, and for pictures: Lang, Bradley, & Cuthbert, 2008), and more recently in the auditory non-linguistic domain (for excerpts of music: Vieillard, Peretz, Gosselin, & Khalfa, 2008). To our knowledge, there is no such database for emotional spoken words, at least in French, the language used in the experiments run by Bertels et al. (under revision). Therefore, creating a normative set of ecologically valid auditory linguistic stimuli was the aim of the present study.
Words found in the present database were selected on the basis of their a priori emotional valence, namely that they were negative, positive or neutral. Although debated, the idea that emotional valence is a determining dimension for the capture of attentional resources still prevails. Emotional valence has been identified as the most powerful measure of the emotional nature of the stimuli, explaining the largest part of the variance in affective responses (e.g., Lang, Greenwald, Bradley, & Hamm, 1993; Russell, 1978). Also, given their increasing use in recent studies, we added taboo words (e.g., erotic words and insults). We thus had four predefined categories, each containing 20 words.
These four word types were matched according to their oral frequency and number of phonological neighbours (cf. Lexique 3.01, New, Pallier, Ferrand, & Matos, 2001), two lexical characteristics that are known to strongly contribute to word recognition and which might induce potential confounds. Indeed, Larsen, Mercer, and Balota (2006) showed that numerous interference effects linked to the emotional content of words in emotional Stroop tasks could in fact be related to lower frequency of use and to the number of neighbours (in the case of written words, orthographic neighbours) of the emotional words compared to the control (neutral) words.
In the present study, independent, unselected participants rated each selected word on one of four different emotional scales. One scale aimed at rating the emotional valence of the words so as to check for the validity of our predefined word categories. The rating had to be made on a continuum ranging from very negative to very positive, using the same method as previous studies regarding written words (e.g., Bonin et al., 2003; Messina et al., 1989). Although many researchers agree that emotional valence is a crucial dimension of affective stimuli (e.g., Barrett, 2006; Hellige, 1993), other emotional characteristics are often considered in the literature. For this reason, the 80 selected words were also rated on three related emotional scales.
In particular, given that both emotional valence and arousal have been considered to be the main dimensions of emotional affect (e.g., Bradley & Lang, 1999a; Russell, 1980) and that an increasing number of studies insist on the importance of the arousal level of the stimuli (e.g., Schimmack, 2005), the stimulating nature of the words was rated on another scale. However, even if arousal is recognised to be an important factor, it would not entirely account for some of the observed attentional effects linked to emotional words (see e.g., Mathewson, Arnell, & Mansfield, 2008). This might hold particularly true for taboo words, which are highly arousing but also shocking. One possibility is that the shocking nature of these words could explain, at least in part, the observed effects. For this reason, all the words were rated on a scale in which participants had to judge to what extent their meaning was shocking. Finally, a further scale aimed at rating the threat value of the words. It should be noted that threat and negative emotional valence are often conflated. However, not all negative words are necessarily threatening: some may evoke sadness, for example. It is therefore interesting to distinguish between these two emotional traits for each particular word.
In addition, we manipulated the emotional prosody of the words. In order to imitate the presentation of written words as closely as possible, about half of the participants rated the 80 words uttered in a neutral tone of voice. Nevertheless, one advantage of oral over written language is that emotional prosody allows one to emphasise the intrinsic emotional meaning of spoken words. Hence, the other participants rated words uttered in a tone of voice emotionally congruent with their emotional meaning, a presentation condition that might be ecologically more valid.

Participants
The participants were 166 first-year students of the Université Libre de Bruxelles (116 women), aged 17 to 47 years (mean: 21). They were paid (120 of them) or were given course credits for their participation. All had spoken French for at least the last 10 years.
They were divided into eight groups, according to the scale on which they rated the stimuli (emotional valence, arousal, threat value, shock value) and to the emotional tone of voice in which the words were pronounced (neutral, emotionally congruent). There were 20 participants in each group, except in Groups 1 (24) and 5 (22) which rated the emotional valence and the shock value of words uttered in a neutral tone of voice, respectively.

Stimuli
The stimulus set consisted of 80 mono- or disyllabic words, a priori emotionally positive, negative, taboo and neutral (20 of each). These words were chosen from the relevant literature (Bonin et al., 2003; Messina et al., 1989) or were generated by collaborators.
On the basis of the online database Lexique 3.01 (New et al., 2001), the four sets of emotional words were matched according to their number of phonological neighbours, oral frequency and phonological uniqueness point.
The words were pronounced by a 25-year-old French-speaking actress (average fundamental frequency: 167 Hz) in a neutral and in an emotional tone of voice which was congruent with the emotional meaning of the word. Naturally, tones of voice did not differ for neutral words, so that the same stimuli were used in both conditions. Each word was repeated three times in each tone of voice in order to find the best stimulus. This was selected by two or three independent judges. For the neutral tone of voice, they made an independent judgement as to which stimulus was pronounced in the most neutral way. For the emotional tone of voice, they were asked to judge which stimulus was pronounced with the intonation that best fitted its emotional significance. When the two judges disagreed, a third one was asked to make the final decision.
Words were digitally recorded on a Sony MiniDisc and were then transferred to a Macintosh Powerbook G3 via the interface Digidesign DIGI 002 Rack. They were cleaned, normalised and synchronised with the Soundtool/Digidesign 6.2.2 software. Mean word length was 613 ms for words uttered in a neutral tone of voice and 632 ms for words spoken in an emotionally congruent tone of voice. The whole set of words can be downloaded at http://homepages.ulb.ac.be/~jbertels.

Procedure
Participants were tested individually or simultaneously, with one participant per experiment box in the same room. Each participant sat in front of a computer screen, wearing headphones. The session began with detailed instructions which were presented on the screen. The experimenter was present to answer potential questions. Participants were told that they would hear a word during each trial, and that they had to rate this word on a predetermined scale. Participants in groups 1 and 2 rated the emotional valence of words on a seven-point scale, ranging from (1) "very negative, unpleasant, disagreeable" to (7) "very positive, pleasant, agreeable". The terms defining the extremes of this scale were those used by Messina et al. (1989). Participants in groups 3 and 4 rated the arousal level of the stimuli on a seven-point scale, ranging from (1) "very calming, soothing" to (7) "very arousing, alerting". In both scales, participants were explicitly asked to use the "4" response when the word was not emotionally loaded or when it did not elicit a particular emotional activation, respectively. Participants in groups 5 and 6 rated the words according to their threat value on a five-point scale, ranging from (1) "not threatening" to (5) "very threatening". In the same vein, participants in groups 7 and 8 rated the shock value of the word on a five-point scale, ranging from (1) "not taboo, shocking, disturbing" to (5) "very taboo, shocking, disturbing, vulgar, rude, insulting". Participants in all groups were invited to use all response keys corresponding to intermediate values. Examples were provided and participants were reminded that a personal, subjective rating was required and that, as such, there were no right or wrong answers.
Participants answered by pressing the appropriate digit on the keyboard of a Macintosh Powerbook or a Mini Mac. Stimulus presentation, timing and data collection were controlled using the Psyscope 1.2.5 PPC software (Cohen, MacWhinney, Flatt, & Provost, 1993). No time limit was imposed. For each participant, trials were presented in a different random order. The experiment lasted for about 15 minutes.

Results
The mean rating scores obtained on each scale for each of the 80 words are presented in the Appendix.

Reliability
In order to estimate inter-rater reliability, two subgroups of participants were randomly formed within each of the eight groups (odd and even participants) and correlations between the ratings were calculated. All these correlation coefficients are between .94 and .98, p < .001.
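The odd/even procedure described above can be illustrated with a minimal sketch. It assumes, as the text does not specify this, that the ratings of each half were averaged per word before the two halves were correlated; the function names and the data layout (one list of word ratings per participant, all in the same word order) are hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length sequences of numbers."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def split_half_correlation(ratings):
    """Correlate per-word mean ratings of odd- and even-numbered participants.

    ratings: list of per-participant lists (hypothetical layout),
    each holding one rating per word in the same word order.
    """
    odd_half = ratings[0::2]   # participants 1, 3, 5, ...
    even_half = ratings[1::2]  # participants 2, 4, 6, ...
    n_words = len(ratings[0])
    # Assumption: each half is summarised by its mean rating per word.
    odd_means = [sum(p[w] for p in odd_half) / len(odd_half) for w in range(n_words)]
    even_means = [sum(p[w] for p in even_half) / len(even_half) for w in range(n_words)]
    return pearson(odd_means, even_means)
```

Applied to one group, the function would return a single coefficient comparable to those reported above, provided the averaging assumption matches the original analysis.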
Another way to estimate inter-rater reliability is to calculate the mean correlations between participants in each of the eight groups. Hence, for each group constituted by 20 participants, the mean inter-rater correlation was based on the 190 existing correlations between the 20 participants (20*(20-1)/2). As proposed by Rosenthal (1982; see also Hermans & De Houwer, 1994; Van der Goten, De Vooght, & Kemps, 1999), these mean correlations were used to estimate the effective inter-rater reliability, following the Spearman-Brown formula. The eight inter-rater agreements were between .95 and .98. Given these results, it seems that our evaluations can be considered to be quite reliable.
Regarding emotional valence, correlations between items common to our study and the earlier studies by Messina et al. (1989, 19 items), Bonin et al. (2003, 10 items) and Bradley and Lang (1999b, ANEW, 34 items) are .99, .98 and .96 respectively, p < .001. However, regarding arousal, the correlation between items common to our study and the ANEW (Bradley & Lang, 1999b) is non-significant, r = .19, p > .10. Potential explanations for this finding are provided in the Discussion.
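As an illustration of the computation just described, the sketch below averages all pairwise correlations between raters and then applies the Spearman-Brown formula, R = n·r̄ / (1 + (n − 1)·r̄), where n is the number of raters and r̄ the mean inter-rater correlation. The function names and data layout are hypothetical, and the numerical value used in the usage note is not taken from the study.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length sequences of numbers."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def mean_interrater_r(ratings):
    """Mean of all pairwise correlations between raters.

    ratings: list of per-rater lists (hypothetical layout), one rating
    per word in the same word order. With 20 raters there are
    20 * 19 / 2 = 190 pairs.
    """
    pairs = list(combinations(ratings, 2))
    return sum(pearson(a, b) for a, b in pairs) / len(pairs)


def spearman_brown(mean_r, n_raters):
    """Effective reliability of the mean of n_raters raters."""
    return n_raters * mean_r / (1 + (n_raters - 1) * mean_r)
```

For example, a purely hypothetical mean inter-rater correlation of .50 with 20 raters would give 20 × .50 / (1 + 19 × .50), that is, about .95, which shows how moderate pairwise agreement can still yield a highly reliable group mean.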

Analyses of variance
For each scale, a 4 (word type: negative/positive/taboo/neutral) x 2 (tone of voice: neutral/emotionally congruent) repeated measures analysis of variance (ANOVA) was applied to the ratings. In each analysis, word type was an inter-item variable and tone of voice an intra-item variable. Average rating scores for emotional valence, arousal, threat value and shock value are presented for each word type in Tables 1, 2, 3 and 4, respectively.
In the analysis of the emotional valence ratings, a main effect of word type was observed, F(3, 76) = 697.38, p < .001, reflecting the fact that all word types differed from each other, with negative words leading to the lowest (most negative) ratings, followed by taboo, neutral and positive words, all ps < .001. Our preliminary categorisation of words thus appears to be well-founded on the basis of the emotional valence ratings. Interestingly, the taboo words were judged as less negative than the negative words used. The effect of tone of voice is also significant, F(1, 76) = 69.093, p < .001, and reflects the fact that words uttered in an emotionally congruent tone of voice were rated as more negative overall than the words uttered in a neutral tone of voice. The interaction between these variables is also significant, F(3, 76) = 15.948, p < .001. Indeed, negative and taboo words were judged as more negative when pronounced in an emotionally congruent rather than neutral tone of voice, F(1, 19) = 10.292, p = .005 and F(1, 19) = 54.101, p < .001, respectively, while no difference was found for positive words, F < 1. In addition, neutral words were subject to a contextual effect: even though they were unchanged, they were rated as more negative when the emotional words among which they were presented were pronounced in an emotionally congruent tone of voice, F(1, 19) = 17.9, p < .001.
In the analysis of the arousal ratings, we observed a main effect of word type, F(3, 76) = 190.665, p < .001, reflecting higher arousal ratings for negative and taboo words (which did not differ from each other, p > .10) than for neutral and positive words, all ps < .001. The arousal ratings of neutral and positive words differed significantly, p < .001, with positive words rated as less arousing than neutral ones. The effect of tone of voice is also significant, F(1, 76) = 49.008, p < .001, and reflects the fact that words spoken in an emotionally congruent tone of voice received higher arousal ratings than words uttered in a neutral manner. The interaction between word type and tone of voice is also significant, F(3, 76) = 17.976, p < .001. Indeed, while negative and taboo words were rated similarly when the tone of voice was emotionally congruent, F < 1, negative words were rated as more arousing than taboo words when they were uttered in a neutral tone of voice, F(1, 39) = 17.028, p = .002. Consistent with this, the effect of tone of voice is significant for taboo words, F(1, 19) = 68.686, p < .001, but not for negative words, F < 1. As with taboo words, positive words were rated as more arousing when uttered in an emotionally congruent tone of voice, F(1, 19) = 30.77, p < .001. No difference as a function of tone of voice was observed for neutral words, F < 1.
In the analysis of the threat value ratings, the effect of word type is significant, F(3, 76) = 298.031, p < .001, and reflects the fact that negative words were rated as more threatening than taboo words, which were rated as more threatening than positive and neutral words, all ps < .001; positive and neutral words did not differ from each other, p > .10. Tone of voice did not significantly affect ratings overall, F(1, 76) = 3.044, p = .085, but the interaction between word type and tone of voice is significant, F(3, 76) = 14.287, p < .001. While taboo words were considered to be slightly more threatening when uttered in an emotionally congruent tone of voice, the opposite effect was observed for negative words, F(1, 19) = 4.659, p < .05 and F(1, 19) = 23.762, p < .001. No difference as a function of tone of voice was observed for positive and neutral words, F(1, 19) = 1.629, p > .10 and F < 1.
The analysis of the shock value ratings reveals an effect of word type, F(3, 76) = 350.829, p < .001. Taboo words were rated as more shocking than the other word types, all ps < .001; negative words were considered to be more shocking than positive and neutral words, both ps < .001; and positive and neutral words did not differ from each other, p > .10. The effect of tone of voice is also significant: words uttered in an emotionally congruent tone of voice were rated as more shocking than words uttered in a neutral tone of voice, F(1, 76) = 69.069, p < .001. The interaction between these factors is significant as well, F(3, 76) = 27.574, p < .001, and reflects the fact that the effect of tone of voice is only significant for negative and taboo words, F(1, 19) = 24.599 and 64.749, both ps < .001, not for positive and neutral words, both F < 1.
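The follow-up tests of the tone-of-voice effect within a word type are reported with (1, 19) degrees of freedom, which is consistent with a by-items comparison over the 20 words of that type. The exact procedure is not detailed here, so the sketch below is only one plausible reading: it treats each simple effect as a paired comparison of per-word mean ratings, for which F(1, n − 1) equals the square of the paired t statistic. The function name and input layout are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev


def tone_effect_f(neutral_means, congruent_means):
    """F statistic for a paired, by-items effect of tone of voice.

    neutral_means, congruent_means: per-word mean ratings for the same
    words in the two tones of voice (hypothetical layout).
    Returns (F, (df_effect, df_error)); F is the squared paired t.
    """
    diffs = [c - n for n, c in zip(neutral_means, congruent_means)]
    n_items = len(diffs)
    # Paired t: mean difference divided by its standard error.
    t = mean(diffs) / (stdev(diffs) / sqrt(n_items))
    return t * t, (1, n_items - 1)
```

With 20 words per type, the function returns degrees of freedom of (1, 19), matching the form of the statistics reported above under this assumption.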

Correlation analyses
Correlation analyses were carried out on the collected scores. For each scale, word ratings in the neutral and emotionally congruent tones of voice were strongly correlated (emotional valence: r = .969, arousal: r = .929, threat: r = .969, shock value: r = .985, all p < .01). For each tone of voice, the scores on the different scales were also strongly correlated, as illustrated in Figures 1 to 6, with part A relative to the neutral tone of voice, and part B to the emotional tone. Note that since the highest level of emotional valence corresponds to a positive emotional valence, correlations between emotional valence and ratings on the other scales are negative. These very high inter-scale correlations are apparently due, at least in part, to the fact that they were calculated combining all word types. There are, however, significant inter-scale correlations for some word types considered separately. The correlation between emotional valence and arousal (Figure 1) is significant for both the neutral and the taboo words in the neutral tone (r = -.604, p = .005; r = -.545, p < .015) as well as in the emotional tone of voice (r = -.474, p < .04; r = -.807, p < .001); and for the negative words in the emotional tone only (r = -.471, p < .04). The correlation between emotional valence and threat (Figure 2) is significant for both the negative and the taboo words in both tones of voice (neutral tone: r = -.464, p < .04 and r = -.59, p < .01; emotional tone: r = -.564, p = .01 and r = -.686, p = .001). The correlation between emotional valence and shock value (Figure 3) is significant for the neutral words in both the neutral tone (r = -.461, p < .05) and the emotional tone (r = -.69, p = .001), and for the negative words in the emotional tone only (r = -.461, p < .05).
The correlation between shock and threat value ratings (Figure 4) is significant for the negative words in both the neutral (r = .647, p < .01) and the emotional tone (r = .608, p < .01), and for the positive and taboo words in, respectively, the emotional (r = .613, p < .01) and the neutral tone (r = .751, p < .001). The correlation between arousal and shock (Figure 5) is significant for the positive and negative words in the emotional tone (r = .515, p = .02; r = .617, p < .01) and for the taboo words in the neutral tone (r = .593, p < .01). Finally, the correlation between arousal and threat value ratings (Figure 6) is significant for the negative and taboo words in both tones of voice (neutral tone: r = .641, p < .01, r = .645, p < .01; emotional tone: r = .87, p < .001, r = .752, p < .001). These data suggest that the negative and taboo words present a larger number of significant inter-scale associations (respectively, 9 and 8 of the 12 inter-scale correlations) than the neutral and positive words (4 and 2, respectively).
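The contrast drawn above between correlations computed on all word types combined and those computed within each word type can be made concrete with a small sketch. The data layout (a list of (word type, rating on scale X, rating on scale Y) triples) and the function names are hypothetical; the sketch only shows how differences between category means can produce a strong pooled correlation even when the within-category relation is different.

```python
from collections import defaultdict
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length sequences of numbers."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def correlations_by_type(rows):
    """Pooled and per-word-type correlations between two rating scales.

    rows: iterable of (word_type, x_rating, y_rating) triples
    (hypothetical layout). Returns (pooled_r, {word_type: r}).
    """
    rows = list(rows)
    pooled = pearson([x for _, x, _ in rows], [y for _, _, y in rows])
    grouped = defaultdict(lambda: ([], []))
    for word_type, x, y in rows:
        grouped[word_type][0].append(x)
        grouped[word_type][1].append(y)
    per_type = {t: pearson(xs, ys) for t, (xs, ys) in grouped.items()}
    return pooled, per_type
```

Inspecting the per-type coefficients alongside the pooled one is a simple way to check how much of an overall inter-scale correlation reflects differences between word categories rather than a relation holding within each category.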

Discussion
The present study provides norms for four emotional dimensions of a corpus of 80 spoken French words uttered in a neutral and in an emotionally congruent tone of voice. A new normative set of ecologically valid auditory linguistic stimuli is thus now at the research community's disposal.
Four groups of 20 spoken words were initially selected on the basis of their a priori emotional valence (neutral, positive, negative) and taboo nature. Given the contribution of oral frequency, number of phonological neighbours and phonological uniqueness point to word recognition, the groups were matched according to these lexical characteristics. Each of the selected words was rated for its emotional valence, as well as on related emotional dimensions frequently considered in the literature, namely arousal and threat value. We also added shock value, assuming that the effects of taboo words would not be due only to their arousal level (see e.g., Mathewson et al., 2008), but could rather be explained, at least in part, by their intrinsic "tabooness".
As expected, these four dimensions were highly correlated in the sample of words used. Although these correlations are certainly due, at least in part, to the fact that they were calculated on all word types combined, we observed that some of the ratings were still correlated when considering, for a specific tone of voice, each word type separately. This was particularly true for negative and taboo words. On a practical level, this prevents the selection of words from our database on the basis of orthogonally varying dimensions. It also undermines the possibility of assessing the specific contribution of a particular emotional dimension of the words to an observed effect. Nevertheless, specific differences between the four predefined categories (neutral, negative, positive and taboo words) on each scale point to the importance of taking into account various emotional dimensions of a word. Furthermore, in addition to demonstrating the effect of emotional prosody on the emotional judgments of spoken words, the observed ratings also stress differences between the four emotional categories regarding the influence of the tone of voice. When analysing the scatter plots further, one notes that uttering the words in an emotional rather than neutral tone of voice tended to distinguish negative and taboo words more clearly from neutral and positive words, particularly regarding emotional valence and arousal. Indeed, for emotional valence (in Figures 1B, 2B and 3B), negative and taboo words are clearly separated from neutral and positive words, whereas in Figures 1A, 2A and 3A there is some overlap; and for arousal this is apparent in Figures 1, 5 and 6. In contrast, the dissociation between these groups of emotional words is present for both shock (Figures 3, 4, 5) and threat (Figures 2, 4, 6) with both tones of voice.
Apart from these general observations, differences between taboo and negative words emerged. While taboo words were judged to be more shocking than negative words, the latter were judged as more threatening, more negative and more arousing (at least when uttered in a neutral tone of voice). These data support our thesis that shock value is an intrinsic characteristic of the taboo words we used and that arousal is probably not the main characteristic of taboo words responsible for their effects on attentional orienting (e.g., Bertels et al., under revision).
Still, whereas both the shock and threat value ratings of taboo words were higher when they were uttered in an emotionally congruent than in a neutral tone of voice, this was not the case for negative words: when uttered in an emotionally congruent tone of voice, they were considered to be more shocking but less threatening than when uttered in a neutral tone of voice. A possible explanation of these results is that a negative word uttered in a neutral tone of voice may evoke coldness. This would make such words even more threatening than when they were uttered in an emotionally congruent tone of voice. The dissonance between the negative emotional meaning and the neutral tone of voice could have created a stronger feeling of threat than the one evoked when the negative meaning and the tone of voice corresponded.
Looking at the scatter plots, it is striking that the overlap between negative and taboo words increases along the threat dimension when they are pronounced in an emotionally congruent tone of voice relative to a neutral one (see Figures 2, 4, 6). In contrast, the emotional tone of voice tends to increase the separation between the same word types in terms of shock value (see Figures 3, 4, 5). The latter effect is probably due to the stronger impact of the congruent emotional tone on the shock value ratings of taboo words as compared to negative words (see Table 4).
In the context of emotional words uttered in an emotionally congruent tone of voice, neutral words were judged as more negative but not as more arousing, threatening or shocking than when presented among emotional words uttered in a neutral tone of voice. Since they remained unchanged across the two conditions of tone of voice, neutral words were thus subject to a contextual effect, but only as regards emotional valence ratings.
Remarkably, positive words from our database were less arousing than the neutral words, whereas the inverse pattern of results is usually observed in normative data such as the ANEW (Bradley & Lang, 1999b) for written verbal material and the IAPS (Lang et al., 2008) for pictorial material. This can easily be explained by the fact that the arousal scale we used had two extremes with an intermediate unarousing, "null" level. At one extreme, words were judged to be calming and soothing, while at the other extreme they were rated as arousing and alerting (see Method). In the scale used in the ANEW and the IAPS, the unarousing level was in fact associated with the relaxed, calm level, also judged to be dull or sluggish, while at the other extreme words were judged to be arousing and stimulating. Hence, rather than being less arousing than neutral words, the positive words used here were in fact judged to be more calming, while the neutral words were mostly rated as eliciting no particular emotional activation (see Table 2).
As noted above, although there is considerable association between the four emotional dimensions considered in the present study, these results (even if inevitably dependent upon the specific set of words used) stress the relevance of prosody for the words' emotional evaluation, as well as the need to distinguish between the various dimensions that underlie emotional judgments.