

Cerebral Cortex Advance Access originally published online on March 8, 2006
Cerebral Cortex 2007 17(2):339-352; doi:10.1093/cercor/bhj151

© The Author 2006. Published by Oxford University Press. All rights reserved. For permissions, please e-mail: journals.permissions@oxfordjournals.org

FMRI Study of Emotional Speech Comprehension

Virginie Beaucousin1, Anne Lacheret2, Marie-Renée Turbelin3, Michel Morel2, Bernard Mazoyer1,3 and Nathalie Tzourio-Mazoyer1

1 Groupe d'Imagerie Neurofonctionnelle, UMR6194, Centre National de la Recherche Scientifique/CEA/Universités Caen et Paris 5, France, 2 Centre de Recherches Inter-langues sur la Signification en Contexte, FRE 2805, Centre National de la Recherche Scientifique/Université Caen, 3 IRM CHU Caen, Institut Universitaire de France

Address correspondence to Nathalie Tzourio-Mazoyer, UMR6194 GIP Cyceron, BP 5229, 14074 Caen Cedex, France. Email: tzourio{at}cyceron.fr.


    Abstract
Little is known about the neural correlates of affective prosody in the context of affective semantic discourse. We used functional magnetic resonance imaging to investigate this issue while subjects performed 1) affective classification of sentences having an affective semantic content and 2) grammatical classification of sentences with neutral semantic content. Sentences of each type were produced half by actors and half by a text-to-speech software lacking affective prosody. Compared with neutral sentences processing, sentences with affective semantic content—with or without affective prosody—led to an increase in activation of a left inferior frontal area involved in the retrieval of semantic knowledge. In addition, the posterior part of the left superior temporal sulcus (STS) together with the medial prefrontal cortex were recruited, although not activated by neutral sentences classification. Interestingly, these areas have been described as implicated during self-reflection or other's mental state inference that possibly occurred during the affective classification task. When affective prosody was present, additional rightward activations of the human-selective voice area and the posterior part of STS were observed, corresponding to the processing of speaker's voice emotional content. Accurate affective communication, central to social interactions, requires the cooperation of semantics, affective prosody, and mind-reading neural networks.

Key Words: emotion • fMRI • language • prosody • theory of mind


    Introduction
Emotional verbal communication is a fundamental element of human relationships in which comprehension emerges from the processing of both linguistic and pragmatic information. Whereas linguistic information includes the integration of the meaning of words (semantics) and sentences (syntax), pragmatic information concerns the processing of gestures, facial expressions, and emotional prosody that accompanies the oral expression of language. How do semantic and emotional prosodic contents interact in the brain to complete an accurate comprehension of emotional discourse?

The neural basis of neutral speech understanding is well documented, and numerous reports have led to a quite clear definition of the left hemispheric frontal and temporal areas involved in semantic and syntactic processing (for a review, see Vigneau and others 2006). Regarding the question of the neural basis of emotional speech processing, studies are scarce. Only one study has investigated the semantic integration of emotional discourse at the word level (Beauregard and others 1997), whereas most reports have focused on the neural implementation of emotion conveyed by prosody. A survey of these studies shows that the comprehension of emotional prosody recruits both the right inferior frontal and temporal areas (Mitchell and others 2003) together with homologous leftward regions that are known to process the linguistic aspects of language (Wildgruber and others 2002, 2004, 2005; Kotz and others 2003; Grandjean and others 2005). Such evidence suggests the involvement of syntactico-semantic areas during emotional prosodic processing and questions the specificity of right temporal areas for emotional prosodic processing, a conclusion based on the observations of aprosodic patients (Ross 1981).

However, before a definite conclusion can be drawn, the possible impact of the paradigms used in the functional imaging studies on the results obtained must be considered. In these reports, subjects were generally presented with auditory stimuli having a semantic content inconsistent with emotional prosody (e.g., unintelligible stimuli [Kotz and others 2003], sentences constructed with pseudowords [Grandjean and others 2005], or sentences with neutral content [Wildgruber and others 2002, 2004, 2005] spoken with emotional prosody). Such a paradigm, designed to remove semantic processing from the cognitive task, could have paradoxically led to an increase in the semantic demand. This proposed effect is supported by the observation that the more unintelligible the speech is, the more the activity in the left perisylvian semantic areas increases (Meyer and others 2002; Kotz and others 2003).

The effect could also attest to the existence of a close cooperation between semantic and prosodic systems necessary to perform accurate speech comprehension. As a matter of fact, strong interactions do exist between prosodic and semantic systems during emotional speech comprehension: a drastic reduction in performance occurs in normal volunteers during affective categorization of sentences lacking syllabification but including affective prosody (Lakshminarayanan and others 2003), whereas aprosodic patients dramatically improve their scores during affective speech comprehension to 70% correct answers (CA) when affective sentences include congruent semantic content (Bowers and others 1987).

To disentangle linguistic and prosodic neural components and to further investigate their relative involvement during affective speech comprehension, we designed a functional magnetic resonance imaging (fMRI) paradigm. To uncover the areas processing affective prosody, we compared the affective classification of sentences spoken by actors with the affective classification of sentences produced by Kali, a text-to-speech software that generates sentences from natural spoken syllables and includes grammatical prosody but is devoid of affective prosody (Morel and Lacheret-Dujour 2001). Sentences produced by actors and Kali were equivalent in terms of affective syntactico-semantic content. In order to explore the neural correlates of affective sentence comprehension, we compared the regions involved in affective classification of sentences having an affective (i.e., emotional and attitudinal) semantic content with the areas obtained during the grammatical classification of sentences having neutral semantic content. This comparison was performed by means of a conjunction on sentences enounced by actors and Kali to provide the neural network of affective semantic content comprehension independently of the presence of affective prosody.


    Materials and Methods
Elaboration of the Stimuli

Sentence Construction

The initial corpus was composed of 120 sentences with emotional semantic content (anger, sadness, or happiness), 120 sentences with attitudinal semantic content (obviousness, doubt, or irony), and 80 neutral sentences (see Appendix A for examples of sentences). The construction of the sentences was identical for all types of sentences: length, word frequency, and imageability were matched across conditions. The words that composed the sentences consisted of 2 or 3 syllables and were frequent (on a text sample of 100 million words, the words selected appeared at least 2000 times) and highly imageable (scored 5 out of 6) as assessed in the Brulex database (Content and others 1990). Although most of the sentences had a simple syntactic structure, including subject, verb, and complement, 107 sentences had a more complex structure with an additional complement. These more complex sentences were equally represented in each category, corresponding to 5% of the total number of sentences. The length of the sentences was equivalent across conditions (mean number of words per sentence including functional words: 9.2 ± 2 for neutral, 10 ± 2 for happiness, 10.4 ± 2 for sadness, 11.4 ± 3 for anger, 9.1 ± 2 for doubt, 10.2 ± 2 for obviousness, 10.7 ± 2 for irony).

Sentence Stimuli Production

All sentences were recorded twice. They were either enounced by actors with appropriate grammatical and affective prosody or produced by Kali, a text-to-speech synthesizer that constructs sentences from naturally spoken syllables and includes grammatical prosody but lacks affective prosody (Morel and Lacheret-Dujour 2001). In order to avoid the possible confound of speaker's gender (Wildgruber and others 2002), half of the sentences produced by Kali were pronounced with a female voice whereas the remaining half were produced with a male voice. Similarly, half of the sentences were enounced by an actor and the other half by an actress.

Sentence Selection

A pilot experiment was conducted with 16 subjects (9 men) to select the affective and neutral sentences to be used in the functional study. The criteria were that 1) affective (emotional and attitudinal) sentences enounced by actors had to be accurately classified by all the subjects and 2) affective and neutral sentences produced by Kali had to be understood by all the subjects. Starting with 320 sentences, 180 affective sentences were chosen, 30 per category, including 15 produced by Kali and 15 enounced by actors. In addition, a set of 90 neutral sentences was selected.

Subjects

Twenty-three young healthy volunteers participated in the fMRI study, 11 men and 12 women (23.3 ± 3 years). All were right-handed (Edinburgh score = 88.7 ± 13 [Oldfield 1971]) university students (4 ± 2 years at university) who reported French as their mother tongue and were selected as having a typical leftward hemispheric asymmetry on functional images. They had no auditory deficit, and their T1-weighted magnetic resonance images were free from abnormalities. All gave informed written consent to the study, which was approved by our local ethics committee (Comité Consultatif de Protection des Personnes pour la Recherche Biomédicale de Basse-Normandie no. 99/36).

Procedure

Prior to the experiment, subjects were given instructions and training.

During the fMRI acquisitions, the subjects underwent 8 runs: 4 runs of affective classification and 4 runs of grammatical classification. During the affective classification task, the subjects had to classify the affective sentences that they heard into 1 of 3 categories. The emotional sentences had to be classified as happy, angry, or sad, whereas attitudinal sentences had to be classified as expressing doubt, irony, or obviousness. Both affective classifications (emotional and attitudinal) were performed in separate runs twice: once with sentences enounced by the actors, that is, with affective prosody (2 runs AffAct), and once with different sentences produced by Kali, that is, devoid of affective prosody (2 runs AffKali).

During the grammatical classification, the subjects had to classify the subject of the sentences according to its form: first, second, or third person. This task was performed on neutral sentences in 2 different runs: in one run the sentences were enounced by actors (GrAct) and in the other they were produced by Kali (GrKali). Both runs were replicated with different sentences to further allow independent statistical comparison with the 4 runs of affective classification.

The subjects were given first the 4 runs containing sentences produced by Kali (2 AffKali and 2 GrKali) to avoid an influence of the affective prosody carried by the sentences enounced by actors. Within each session of 4 runs, pseudorandom presentation was used to avoid a confounding effect of order.

Outside the scanner, subjects answered a postsession questionnaire to determine the strategy they used to perform the affective classification. They were asked whether they used semantic (meaning of the words), syntactic (structure of the sentences), or prosodic cues (intonation) and whether they had rehearsed the sentences produced by Kali or enounced by actors.

Experimental Design and Apparatus

A block design was used for the present paradigm. Each of the 8 runs lasted 6 min 36 s and began with 60 s of a control task consisting of the detection of 9 beeps, presented through earphones at random interstimulus intervals. Subjects had to press 1 of the 3 buttons on a pad in alternation. This block was followed by 5 blocks of classification task lasting 34 s, alternating with 5 blocks of control task lasting 32 s. During each block of classification task, subjects listened to 9 sentences, each lasting about 3 s (Fig. 1). Answers and response times (RTs), limited to 1 s, were collected using a pad with 3 buttons corresponding to the 3 choices proposed to the subjects in each classification task. The answers were assigned to the keys following the alphabetical order of the categories to give the subjects a mnemonic aid (e.g., in the emotional classification, anger was assigned to key 1, happiness to key 2, and sadness to key 3). Stimuli were presented and responses recorded by a computer running SuperLab Pro version 2.0 (Cedrus, http://www.superlab.com/papers). For technical reasons, responses could not be acquired for 3 male participants.


Figure 1 Spectrograms and pitch contours. Frequency (left ordinate) and fundamental frequency (white line, right ordinate) are given as a function of time (abscissa, ms) for a sentence with affective semantic content ("Super, j'ai gagné beaucoup d'argent au loto"/"Great, I've won a lot of money on the lotto") (A) enounced by an actress and (B) produced by the female voice of Kali, a text-to-speech software, and for a sentence with neutral semantic content ("Le cheval court dans la prairie"/"The horse runs in the meadow") (C) enounced by an actor and (D) produced by the male voice of Kali.

 
Analysis of Behavioral Data

The percentage of CA and RTs for CA (RT, ms) were recorded during the fMRI acquisition in each classification task. Eight of the 11 men and all 12 women were included in this analysis (scores for 3 men were missing due to technical problems). A Kolmogorov-Smirnov test was performed to assess whether the distribution of each variable differed from normality, which was not the case (CA for neutral sentences χ2 = 3.8, P = 0.3, for affective sentences χ2 = 1.7, P = 0.8; RT for neutral sentences χ2 = 0.7, P > 0.99, for affective sentences χ2 = 0.7, P > 0.99). An analysis of variance (ANOVA) with repeated measures was thus conducted with 2 factors: Task (affective vs. grammatical classification) and Voice (actors vs. Kali). Post hoc comparisons were performed using paired t-tests with a Bonferroni correction.
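As an illustration of the post hoc step, the following minimal sketch computes a paired t statistic and a Bonferroni-adjusted alpha. The scores are hypothetical, and the number of comparisons (2) is an assumption, since the text does not state how many post hoc tests were corrected for. With the 20 subjects who had behavioral data, such a paired test has 19 degrees of freedom, which agrees with the t19 values reported in the Results.

```python
import math
import statistics

def paired_t(a, b):
    """Return (t, degrees of freedom) for a paired t-test on equal-length samples."""
    diffs = [x - y for x, y in zip(a, b)]
    n = len(diffs)
    standard_error = statistics.stdev(diffs) / math.sqrt(n)
    return statistics.mean(diffs) / standard_error, n - 1

def bonferroni_alpha(alpha, n_comparisons):
    """Per-test significance level after Bonferroni correction."""
    return alpha / n_comparisons

# Hypothetical CA scores (%) for 4 subjects; these are not data from the study.
ca_actors = [84, 80, 90, 86]
ca_kali = [70, 68, 75, 71]

t_value, dof = paired_t(ca_actors, ca_kali)
alpha_per_test = bonferroni_alpha(0.05, 2)  # 2 comparisons is an assumption
print(f"t({dof}) = {t_value:.2f}, per-test alpha = {alpha_per_test}")
```

The sketch stops at the t statistic because the Python standard library has no Student t distribution; a P value would require a statistics package.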

Analysis of Images

Acquisition of Images

Magnetic resonance imaging (MRI) acquisitions were conducted on a GE Signa 1.5-T Horizon Echospeed scanner (General Electric, Buc, France). The session started with 2 anatomical acquisitions. First, a high-resolution structural T1-weighted sequence (T1-MRI) was acquired using a spoiled gradient-recalled sequence (SPGR-3D, matrix size = 256 × 256 × 128, sampling = 0.94 × 0.94 × 1.5 mm³) to provide detailed anatomic images and to define the location of the 32 axial slices to be acquired during both the second anatomical acquisition and the functional sequences. The second anatomical acquisition consisted of a double echo proton density/T2-weighted sequence (PD-MRI/T2-MRI, matrix size = 256 × 256 × 32, sampling = 0.94 × 0.94 × 3.8 mm³).

Each of the 8 functional runs consisted of a time series of 66 echo-planar T2*-weighted volumes (blood oxygen level-dependent [BOLD], repetition time = 6 s, echo time = 60 ms, flip angle = 90°, sampling = 3.75 × 3.75 × 3.8 mm³). To ensure signal stabilization, the first 3 BOLD volumes of each run were discarded.
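The acquisition parameters can be checked against the run length given in the experimental design. The short sketch below multiplies the number of volumes by the repetition time and counts the volumes left after the initial discard; the variable and function names are illustrative only.

```python
TR_S = 6          # repetition time in seconds, from the text
N_VOLUMES = 66    # volumes per functional run, from the text
N_DISCARDED = 3   # initial volumes dropped for signal stabilization

def run_duration_s(n_volumes, tr_s):
    """Total acquisition time of one run in seconds."""
    return n_volumes * tr_s

def format_min_s(seconds):
    """Format a duration in seconds as 'M min S s'."""
    minutes, secs = divmod(int(seconds), 60)
    return f"{minutes} min {secs} s"

total_s = run_duration_s(N_VOLUMES, TR_S)
kept_volumes = N_VOLUMES - N_DISCARDED
print(format_min_s(total_s), "per run;", kept_volumes, "volumes analyzed")
```

The result, 396 s, matches the 6 min 36 s run duration stated for the block design.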

Preprocessing of Functional Images

The preprocessing pipeline was developed locally on the basis of subroutines from SPM99b (Friston and others 1995; Ashburner and Friston 1999), AIR5.0 (Woods and others 1992), and Atomia (Verard and others 1997), and was encapsulated in a semiautomatic processing pipeline. The preprocessing included 9 steps: 1) correction for differences in BOLD image acquisition time between slices; 2) rigid spatial registration of each of the BOLD volumes onto the fourth BOLD volume of the first acquired run (BOLD4); 3) computation of the spatial rigid registration and resampling matrices from BOLD4 to T2-MRI and PD-MRI to T1-MRI; 4) computation of the nonlinear registration matrix for stereotaxic normalization of the T1-MRI on the Montreal Neurological Institute T1-weighted templates (T1-MNI) (Collins and others 1994) (SPM99b stereotaxic normalization with a 12-parameter affine transformation, 7 × 8 × 7 nonlinear basis functions, 12 nonlinear iterations, medium regularization, bounding box between –90 and +91 mm in the left–right, –126 and 91 mm in the back–front, and –72 and 109 mm in the feet–head directions, sampling 2 × 2 × 2 mm³); 5) combination of the matrices computed at the previous 2 steps, visual checking and optional optimization of the EPI4 (echo-planar imaging 4) to T1-MNI registration in the stereotaxic space; 6) spatial resampling of each BOLD volume into the T1-MNI stereotaxic space; 7) spatial smoothing of each BOLD volume by a Gaussian filter (full width at half maximum = 8 × 8 × 8 mm³); 8) high-pass filtering (cut-off of 0.0102 Hz) of each voxel time course; and 9) normalization of the voxel values by the average of its value in the course of the 2 runs (i.e., across time course).
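Two of the parameters above are easier to interpret after conversion. The sketch below turns the 8-mm full width at half maximum of the smoothing kernel into the standard deviation of the Gaussian, in mm and in 2-mm voxels, and turns the high-pass cut-off frequency into a period in seconds. It uses the standard relation FWHM = 2·sqrt(2·ln 2)·σ and does not reproduce the SPM99b implementation.

```python
import math

def fwhm_to_sigma(fwhm):
    """Standard deviation of a Gaussian with the given full width at half maximum."""
    return fwhm / (2.0 * math.sqrt(2.0 * math.log(2.0)))

FWHM_MM = 8.0        # smoothing kernel, from the text
VOXEL_MM = 2.0       # isotropic sampling after normalization, from the text
CUTOFF_HZ = 0.0102   # high-pass cut-off, from the text

sigma_mm = fwhm_to_sigma(FWHM_MM)
sigma_vox = sigma_mm / VOXEL_MM
cutoff_period_s = 1.0 / CUTOFF_HZ

print(f"sigma = {sigma_mm:.3f} mm ({sigma_vox:.3f} voxels)")
print(f"high-pass period = {cutoff_period_s:.1f} s")
```

The cut-off period is about 98 s, so signal fluctuations slower than roughly one and a half minutes are attenuated.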

Statistical Analysis of Functional Images

The functional data were analyzed and integrated in a statistical model by the semiautomatic software SPM99b (Wellcome Department of Cognitive Neurology, www.fil.ion.ucl.ac.uk/spm/).

The individual data consisted of 8 contrast maps presenting the BOLD signal increase covarying with the cognitive task compared with the control task (beep detection). These 8 contrast maps corresponded to 2 runs of grammatical classification on sentences produced by Kali, 2 runs of grammatical classification on sentences enounced by the actors, 2 runs of affective classification (1 on emotional and 1 on attitudinal sentences) produced by Kali, and 2 runs of affective classification (1 on emotional and 1 on attitudinal sentences) enounced by actors. A second-level analysis was then performed including, for each subject, the 8 BOLD contrast maps. Because no significant difference was observed between the attitudinal and emotional runs at the 0.05 threshold corrected for multiple comparisons, these 8 maps were collapsed into 4 contrast maps in a second-level analysis: one corresponding to the mean of emotional and attitudinal classification of sentences enounced by actors minus beep detection (AffAct); one corresponding to the mean of emotional and attitudinal classification of sentences produced by Kali minus beep detection (AffKali); one corresponding to the mean of both grammatical classification tasks performed on sentences enounced by actors minus beep detection (GrAct); and one corresponding to the mean of grammatical classification tasks performed on sentences produced by Kali minus beep detection (GrKali). In the second-level analysis, the following contrasts were computed:

  1. (GrAct–GrKali) and (GrKali–GrAct) to evaluate the effect of the kind of speaker (P ≤ 0.001 uncorrected threshold).
  2. [(GrAct) ∩ (GrKali)]: conjunction analysis of the grammatical classifications to evidence neutral sentence comprehension areas (0.0025 corrected threshold for multiple comparisons, corresponding to 0.05 per contrast).
  3. [(AffAct–GrAct) ∩ (AffKali–GrKali)]: conjunction of the "affective minus grammatical classification" contrasts obtained when the sentences were produced by Kali (AffKali–GrKali) and enounced by actors (AffAct–GrAct) to evidence the areas involved in affective sentence comprehension (0.0025 corrected threshold, corresponding to 0.05 per contrast).
  4. [AffAct–AffKali] to uncover areas dedicated to affective prosody processing (P ≤ 0.001 uncorrected).
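The contrasts listed above can be written as weight vectors over the 4 condition maps. The sketch below is illustrative only: the condition order, the voxel values, and the function names are hypothetical, and the conjunction threshold is computed under the assumption that the 2 contrasts are treated as independent, which is one reading of "0.0025 corrected, corresponding to 0.05 per contrast".

```python
import math

# Order of the four second-level condition maps (each is "task minus beep detection").
CONDITIONS = ("AffAct", "AffKali", "GrAct", "GrKali")

# Contrast weights over CONDITIONS, following the numbered contrasts in the text.
CONTRASTS = {
    "GrAct-GrKali": (0, 0, 1, -1),
    "GrKali-GrAct": (0, 0, -1, 1),
    "AffAct-GrAct": (1, 0, -1, 0),
    "AffKali-GrKali": (0, 1, 0, -1),
    "AffAct-AffKali": (1, -1, 0, 0),
}

def contrast_value(weights, condition_values):
    """Weighted sum of the per-condition values at one voxel."""
    return sum(w * v for w, v in zip(weights, condition_values))

def conjunction_threshold(per_contrast_p, n_contrasts):
    """Joint threshold if the contrasts are treated as independent (assumption)."""
    return per_contrast_p ** n_contrasts

# Hypothetical BOLD values at a single voxel, in CONDITIONS order.
voxel = (0.8, 0.6, 0.3, 0.25)
aff_minus_gr_act = contrast_value(CONTRASTS["AffAct-GrAct"], voxel)
joint_p = conjunction_threshold(0.05, 2)
print(aff_minus_gr_act, joint_p)
```

Each difference contrast sums to zero, so it is insensitive to activity shared by the two conditions being compared.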

Hemispheric asymmetries of these networks were evaluated using a whole-brain approach. First, we computed asymmetrical contrast maps by subtracting each individual contrast map flipped along its x axis from the corresponding nonflipped map. This resulted in one map per subject and per condition corresponding to the left minus right BOLD value in each voxel of the left side of the map and to the right minus left BOLD value on the right side of the map. Then a second-level analysis was performed on these asymmetrical contrast maps with the same design as the one used for the BOLD variation contrast maps. The asymmetries during AffAct were also investigated (P ≤ 0.001 uncorrected).
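The flip-and-subtract operation can be sketched on a toy 2D slice. This is a minimal illustration, not the pipeline used in the study: the slice is a plain list of rows running along the x axis, index 0 is assumed to be the left-most voxel, and the values are hypothetical.

```python
def asymmetry_map(rows):
    """Subtract the x-flipped map from the original map, row by row.

    Each row runs along the x axis; index 0 is assumed to be the left-most voxel.
    """
    return [
        [value - row[len(row) - 1 - i] for i, value in enumerate(row)]
        for row in rows
    ]

# Hypothetical contrast values for a 2 x 4 slice (not data from the study).
contrast_slice = [
    [3.0, 1.0, 0.5, 1.0],
    [2.0, 2.0, 2.0, 2.0],
]

asym = asymmetry_map(contrast_slice)
print(asym)
```

Under the stated orientation assumption, the left half of each output row holds left minus right values and the right half holds the mirrored right minus left values, so the map is antisymmetric about the midline and a symmetrical activation yields zeros.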

Lastly, the BOLD variations for the local maxima detected as significant in a given contrast were plotted for every task to further characterize their activation profile in the 4 conditions.


    Results
Behavioral Data

The ANOVA evidenced a significant Task x Voice interaction concerning both the number of CA (F19 = 89.6, P < 0.0001, Fig. 2A) and the RT (F19 = 19.9, P = 0.0006, Fig. 2B).


Figure 2 Behavioral results. (A) Average percentage of CA and (B) mean RTs (± standard deviation, ms) during affective classification of sentences with affective semantic content enounced by actors (AffAct, dark gray bar) or produced by Kali (AffKali, light gray bar), during the grammatical classification of sentences with neutral semantic content spoken by actors (GrAct, black bar) or produced by Kali (GrKali, white bar, ***P ≤ 0.001).

 
This interaction was related to the fact that classification of affective sentences (affective classification) enounced by actors with affective prosody (AffAct) was performed faster and with greater accuracy than when this task was performed on sentences produced by Kali devoid of affective prosody (AffKali) (RT: AffAct = 447 ± 71 ms, AffKali = 510 ± 66 ms, t19 = 5.3, P < 0.0001; CA: AffAct = 84 ± 9%, AffKali = 69 ± 9%, t19 = –7.9, P < 0.0001), whereas this was not the case with the grammatical classification task. In other words, affective classification was easier to perform in the presence of affective prosody.

Indeed, the Voice x Task interaction also reflected the fact that such an effect of Voice was not found during the grammatical classification of neutral sentences (this task will hereafter be called the "grammatical classification"): no significant difference was found whether the sentences were enounced by actors (GrAct) or produced by Kali (GrKali) (CA: GrAct = 87 ± 8%, GrKali = 86 ± 6%, paired t-test t19 = –0.9, P = 0.36; RT: GrAct = 385 ± 82 ms, GrKali = 384 ± 89 ms, t19 = 0.4, P = 0.7).

Note that a significant Task main effect was evidenced: for both types of Voice, better performances were achieved during grammatical classification than during affective classification in terms of CA (F19 = 43.7, P < 0.0001) and RT (F19 = 50.4, P < 0.0001). A significant main effect of Voice was also observed independently of the task: greater CA (F19 = 31.7, P < 0.0001) and faster responses were found (RT: F19 = 12.9, P = 0.002) when the sentences were enounced by actors than when they were produced by Kali.

These results were consistent with the subjects' reports in the postsession questionnaire that the grammatical classification was the easiest to perform and that the affective sentences uttered by actors were easier to classify than affective sentences produced by Kali. All subjects indicated that, during affective classification, they used intonation cues to assess the sentences' affective content when it was present, whereas they relied on the affective verbal content of the sentences to complete the task when sentences lacked affective prosody (produced by Kali). In addition, in the presence of affective prosody (sentences enounced by actors), 18 subjects out of 23 (78%) still used the sentences' verbal content in addition to intonation. Note that 11 subjects (48%) rehearsed the sentences, whatever the speaker, to complete the affective classification.

Functional Imaging Results

Grammatical Classification of Neutral Sentences

The comparisons, either (GrKali–GrAct) or (GrAct–GrKali), detected no impact specific to the text-to-speech software or the actors on the cerebral network involved during grammatical classification (even when lowering the threshold to 0.001 uncorrected for multiple comparisons).

The conjunction analysis of the grammatical classification tasks performed on sentences uttered by actors and Kali [(GrAct–beep detection) ∩ (GrKali–beep detection)] revealed massive leftward activations in the temporal, frontal, and parietal lobes (Fig. 3, Table 1). In the left temporal lobe, activations were identified in the superior temporal sulcus (STS) and superior temporal gyrus (STG), extending to Heschl's gyrus, the planum temporale, and to the posterior part of the middle temporal gyrus. In the frontal lobes, the inferior frontal gyrus (IFG), the precentral gyrus, and the supplementary motor area (SMA) were activated. This network also included the parietal lobe stretching from the postcentral to the superior parietal gyrus. The calcarine sulcus, putamen, thalami, and cerebellar cortex also showed BOLD signal increase.


Figure 3 Cortical network engaged during the grammatical classification of sentences with neutral verbal content regardless of speaker. Cortical areas significantly activated during the grammatical classification of sentences produced either by Kali or by actors compared with beep detection are projected on the left (L) and right (R) hemisphere of MNI reference brain (conjunction analysis given at 0.0025 corrected threshold for multiple comparisons). Red scale is for BOLD signal variation; blue scale is for significantly asymmetrical BOLD variations in each hemisphere.

 

Table 1 Cortical areas implicated during grammatical classification of neutral sentences

 
Although BOLD analysis evidenced mirror activations in the right hemisphere, the direct comparison of left and right activations (thanks to the asymmetrical contrast maps) showed a significant leftward lateralization of the activated areas, except for the cerebellar cortex, which instead was asymmetrical to the right.

Neural Substrate of Affective Sentence Comprehension

To identify the network implicated in affective sentence comprehension independently of the presence of affective prosody, we computed a conjunction analysis of the differences between the affective and the grammatical classification obtained when the sentences were spoken by actors and when the sentences were produced by Kali [(AffAct–GrAct) ∩ (AffKali–GrKali)].

The clusters, where greater activity was observed during affective classification than during grammatical classification, could be split according to the profile of their BOLD signal variation calculated at the local maximal peak of activity as the mean BOLD values in each condition (Fig. 4, Table 2).


Figure 4 Brain areas more activated by affective than grammatical classification. Conjunction analysis of the affective minus grammatical classification of sentences enounced by actors (respectively, AffAct and GrAct) with the same contrast on sentences produced by Kali (respectively, AffKali and GrKali) overlaid on MNI-referenced brain template (P ≤ 0.0025 corrected threshold for multiple comparisons). Areas evidenced by this conjunction, and thus showing significantly larger activity when the sentences included an affective content, with or without affective prosody, than during grammatical classification of neutral sentences, were located in the pre-SMA, the MF1, the left IFG (L IFG), and the left pSTS (L pSTS). Bar charts provide the average BOLD signal variation during each condition compared with the beep detection reference task at the local maximal peak of activity (error bars correspond to standard error of the mean; peak coordinates are given in stereotaxic coordinates in mm; *P < 0.05, **P < 0.01 correspond to the results of one-sample t-tests comparing the BOLD signal variation during each condition with the beep detection reference task; a.u., arbitrary units; AffAct, red bar; AffKali, pink bar; GrAct, blue bar; GrKali, purple bar).

 

Table 2 Cortical areas showing higher activity during affective classification of sentences with or without affective prosody than during grammatical classification

 
A first set of areas was composed of the anterior and inferior part of the bilateral IFG, the bilateral anterior insula, the pre-SMA (y > 26 mm), the subcortical areas (left thalamus and right caudate nucleus), and the right cerebellar cortex. These regions already activated by the grammatical classification further showed increased activity during affective classification whether the sentences included affective prosody or not.

The second set of areas located in the medial superior frontal gyrus (MF1) and at the left posterior ending of the STS (pSTS) was activated during affective classification, but they presented a negative BOLD signal variation during grammatical classification.

Comparison of the left and right hemisphere activations in the contrast [(AffAct–GrAct) ∩ (AffKali–GrKali)] using a whole-brain approach (thanks to the asymmetrical contrast maps) evidenced a significant leftward asymmetry in the pars triangularis/orbitaris of the IFG and in the pSTS, given that no activity was detected in the right pSTS.

Note that the reverse comparison (grammatical minus affective classification) did not reveal any differences at a threshold of 0.001 uncorrected for multiple comparisons.

Cerebral Network for the Processing of Affective Prosody

The areas involved in affective prosody processing were uncovered in the difference between brain activity during affective classification of sentences enounced by actors and during affective classification of sentences produced by Kali lacking affective prosody (AffAct–AffKali).

An activation located in the right anterior part of the STS (aSTS) passed the corrected threshold (0.05) (Fig. 5). When the threshold was lowered (0.001 uncorrected for multiple comparisons), this temporal activation spread to the bilateral anterior part of STG, including Heschl's gyri and to the right posterior part of STG (pSTG). At this threshold, the bilateral amygdalae, the putamen, and the hippocampal gyri showed a higher activity when affective prosody was present, as did motor areas, namely, the bilateral precentral gyri and right SMA (Table 3).


 
Figure 5 The right temporal areas and affective prosody. The right aSTS area (R aSTS) and the right pSTG (R pSTG), which were more activated during affective classification in the presence of affective prosody (sentences uttered by actors) than in its absence (sentences produced by Kali), are superimposed on a sagittal slice of the MNI-referenced brain (x = 56, P ≤ 0.001 uncorrected threshold). The cluster R aSTS, which shows a greater BOLD signal increase in the presence of affective prosody than in other conditions, is located close to the HSVA as defined in Belin and others (2000), Belin and Zatorre (2003), and Kriegstein and others (2003). Bar charts provide the average BOLD signal variation during each condition compared with the beep detection reference task at the local maximal peak of activity (error bars correspond to standard error of the mean; peak coordinates are given in stereotaxic coordinates in mm; a.u., arbitrary units; AffAct, red bar; AffKali, pink bar; GrAct, blue bar; GrKali, purple bar).

 


 
Table 3 Cortical areas implicated in affective prosody comprehension

 
Analysis of the mean BOLD signal values calculated for each of the 4 conditions in the local maximal peak of each cluster demonstrated that these areas presented different profiles. The right aSTS and pSTG activated during grammatical classification showed a further increase in activity when an affective semantic content was present and even more when the sentences included both an affective semantic content and affective prosody (Fig. 5). On the other hand, the frontal regions, the left temporal areas, and the amygdalae mainly exhibited a reduction or no increase in activity during the affective classification of affective sentences lacking affective prosody (Fig. 6).


 
Figure 6 Areas showing decreased activity when affective semantic sentences lacked affective prosody. The clusters in the bilateral amygdala, left Heschl's gyrus, and bilateral precentral gyrus, which were obtained in the contrast of affective classification in the presence versus in the absence of affective prosody, are represented on axial slices of the MNI brain. Bar charts provide the average BOLD signal variation during each condition compared with the beep detection reference task at the local maximal peak of activity (error bars correspond to standard error of the mean; peak coordinates are given in stereotaxic coordinates in mm; *P < 0.05, **P < 0.01 correspond to the results of one-sample t-tests comparing the BOLD signal variation during each condition with the beep detection reference task; a.u., arbitrary units; AffAct, red bar; AffKali, pink bar; GrAct, blue bar; GrKali, purple bar).

 
Hemispheric Lateralization of Temporal Areas Recruited by Affective Prosody

As stated in the Introduction, a key issue raised by previous neuropsychological and functional imaging literature concerns the right hemisphere dominance of temporal areas for prosodic processing. Based on the a priori hypothesis that prosodic temporal areas should exhibit a rightward lateralization during prosodic processing, we investigated the significant asymmetries in the AffAct contrast. Only considering the temporal lobe, significant rightward asymmetries were present in a lateral subpart of aSTS (x = 64, y = –6, z = –12, Z score = 4.30, extent = 29 voxels) and in an internal subpart of pSTG (x = 50, y = –38, z = 6, Z score = 4.14, extent = 101 voxels). To provide a detailed description of the behavior of these areas that were detected as asymmetrical, we calculated individually the BOLD signal variations in these clusters on each side (in contrast maps and flipped maps) and for each condition.

A repeated-measures ANOVA was then performed on the BOLD values of these clusters, entering Side (right vs. left hemisphere) and Speaker (Kali vs. Actor) as factors. During affective classification, a significant interaction between Side and Speaker (the Speaker factor corresponding to the presence of affective prosody) was observed (Fig. 7; aSTS: F = 5.6, P < 0.05; pSTG: F = 8.4, P < 0.01). This interaction was related to a larger BOLD increase in right than in left areas when affective prosody was present. A main effect of hemisphere was observed, confirming the larger involvement of the right temporal areas during the affective classification whether affective prosody was present or not (aSTS: F = 4.7, P < 0.05; pSTG: F = 6.1, P < 0.05). A main effect of affective prosody was also found, showing that temporal areas were more involved when sentences included affective prosody than when they lacked it (aSTS: F = 39.7, P < 0.0001; pSTG: F = 72.8, P < 0.0001). Note that during the grammatical classification, neither a main effect of Side (aSTS: F = 0.7, P > 0.05; pSTG: F = 2.1, P > 0.05) nor a main effect of Speaker (aSTS: F = 0.8, P > 0.05; pSTG: F = 0.8, P > 0.05) was observed.
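To make the structure of such a Side × Speaker analysis concrete, the sketch below computes the three F values of a 2 × 2 fully within-subject design. It relies on the fact that, in this design, each effect has one degree of freedom and its F(1, n − 1) equals the squared one-sample t statistic of a per-subject contrast score. This is an illustrative sketch, not the analysis code of this study: the function names are hypothetical and the BOLD values are made up for four imaginary subjects.

```python
import math
from statistics import mean, stdev

# Illustrative sketch only: function names and BOLD values are hypothetical.


def f_from_contrast(scores):
    """F(1, n-1) of a one-df within-subject effect (squared one-sample t)."""
    n = len(scores)
    t = mean(scores) / (stdev(scores) / math.sqrt(n))
    return t * t


def anova_2x2_within(r_act, r_kali, l_act, l_kali):
    """F values for Side, Speaker, and their interaction.

    Each argument is a list with one BOLD value per subject for one cell of
    the design: right/left hemisphere crossed with Actor/Kali.
    """
    cells = list(zip(r_act, r_kali, l_act, l_kali))
    side = [(ra + rk) - (la + lk) for ra, rk, la, lk in cells]
    speaker = [(ra + la) - (rk + lk) for ra, rk, la, lk in cells]
    interaction = [(ra - rk) - (la - lk) for ra, rk, la, lk in cells]
    return {
        "Side": f_from_contrast(side),
        "Speaker": f_from_contrast(speaker),
        "Side x Speaker": f_from_contrast(interaction),
    }


# Made-up values (arbitrary units) for four imaginary subjects.
effects = anova_2x2_within(
    r_act=[4.0, 5.0, 6.0, 5.0],
    r_kali=[2.0, 2.0, 3.0, 2.0],
    l_act=[3.0, 3.0, 4.0, 4.0],
    l_kali=[2.0, 1.0, 3.0, 2.0],
)
for name, f_value in effects.items():
    print(f"{name}: F = {f_value:.2f}")
```

In this formulation, a significant interaction means that the Actor minus Kali difference is not the same in the two hemispheres, which is the pattern described above for the aSTS and pSTG during affective classification.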


 
Figure 7 Lateralization of temporal areas in the presence of affective prosody. Variation of the BOLD signal during the 4 conditions in the clusters corresponding to the local maximal peaks in (A) the aSTS and (B) the pSTG (AffAct, gray line; AffKali, gray dotted line; GrAct, black line; GrKali, dark dotted line; LH, left hemisphere; RH, right hemisphere).

 

    Discussion
 
The present paradigm made it possible to disentangle the areas involved in affective prosody from those involved in affective semantic and syntactic processing during affective sentence comprehension. First, the use of a reference condition involving the comprehension of sentences with neutral emotional content made it possible to isolate the areas dedicated to affective discourse comprehension independently of the presence of affective prosody. These areas were the left inferior frontal area, pSTS, and MF1. Interestingly, whereas IFG was already engaged during grammatical classification of neutral sentences, pSTS and MF1 were specifically involved when emotional verbal material was present. Second, the use of text-to-speech software that included grammatical but not affective prosody made it possible to uncover areas involved in prosodic processing, in conditions with equivalent semantic content. These areas were located within the right temporal lobe and presented a rightward asymmetry, as could have been expected from studies of aprosodia. They corresponded to the human-selective voice area (HSVA) and the integrative posterior temporal cortex.

Network for the Grammatical Classification of Neutral Sentences

Although Kali, the text-to-speech software, sounded natural, we needed to check for its possible impact on neutral sentence comprehension. During the grammatical classification, we observed no differential effect on behavioral results of sentences with neutral verbal content produced by Kali compared with those uttered by actors, demonstrating the good intelligibility and correct grammatical prosody of this software compared with the natural stimuli. In the same vein, functional results did not show any difference during grammatical classification between sentences produced by Kali and sentences uttered by actors. This is very likely because Kali built the speech stimuli from a database of naturally spoken syllables. One study on the impact of synthetic speech on neural activity found greater activity in the left premotor cortex during listening to natural speech than during listening to synthetic speech (Benson and others 2001), but these authors used synthetic stimuli that were not composed of natural tokens.

The grammatical classification of neutral sentences thus appears to be a relevant reference task to remove from the affective classification neural network: 1) IFG and STG activity related to sentence processing (Vigneau and others 2006), 2) right frontoparietal network engagement for attention, anticipation, and selection of the response (Tzourio and others 1997), and 3) activation of pre- and postcentral gyri that corresponded to the sensory-motor cortical representation of the hand (Mesulam 2000) activated by the motor response.

Network for Affective Semantic Comprehension

Semantic and Emotional Frontal Areas

A frontal network was recruited during affective classification of sentences with affective semantic content and, to a lesser extent, during grammatical classification of neutral sentences. Although homologous rightward activity was present, the significant leftward asymmetry of this network attested to its language specificity. These areas were located in the anterior and inferior part of the left IFG, known to be involved in semantic categorization (Poldrack and others 1999; Adams and Janata 2002) and selection of semantic knowledge (Wagner and others 2001; Booth and others 2002). They were easy to relate to the strategy, reported by all subjects, of relying on semantic cues to classify sentences with affective verbal content. In addition, subjects reported mentally rehearsing the sentences, a strategy most likely corresponding to the observed activations in pre-SMA and the left anterior insula, known to be involved in the mental articulation of speech (Ackermann and Riecker 2004).

The present IFG clusters located in the pars orbitaris overlapped the areas activated during the emotional discrimination of sentences compared with the repetition of the last word of these sentences (George and others 1996). They also overlapped the areas reported in studies comparing the judgment of emotional expressiveness with the discrimination of grammatical prosodic accentuation (Wildgruber and others 2004) or comparing emotional discrimination with verbalization of a target vowel (Wildgruber and others 2005). This orbitofrontal area is also activated in the current work by the processing of sentences with affective verbal content independently of the presence of affective prosody, in line with earlier suggestions of the role of this region in emotional processing (Wildgruber and others 2004, 2005). This finding also agrees with the report of activation of the pars orbitaris during the perception of emotional words (Beauregard and others 1997) or gender discrimination operating on an emotional face (Blair and others 1999).

Medial Prefrontal and Left pSTS Activations: Inference of the Speaker's Mental State

A second set of regions, namely, the MF1 and the left pSTS, showed increased activity when subjects performed the affective classification whereas they were not activated during the grammatical classification.

Involvement of the medial wall of the frontal lobe could be related to error detection (Botvinick and others 2004). As a matter of fact, in the present study, behavioral results showed a larger number of errors during the affective classification than during the grammatical classification, a difference that could be related to this higher MF1 activity during the affective classification task. However, this hypothesis is challenged by the numerous reports that located the region sensitive to error detection in the anterior part of the cingulate gyrus, in a lower location than the cluster of the present study (for reviews, see Bush and others 2000; Ridderinkhof and others 2004; Rushworth and others 2004).

Actually, numerous studies on theory of mind (TOM) processing reported peaks that intersect the MF1 activation found in the present study, as shown in Figure 8A (for methods, see Jobard and others 2003). The expression TOM refers to the ability to explain and predict one's own actions and those of other intelligent agents (Premack and Woodruff 1978). The tasks used in these previous TOM studies involved either verbal (Vogeley and others 2001; Harris and others 2005) or visual material (films [Castelli and others 2000], cartoons [Brunet and others 2000; Gallagher and others 2000; Walter and others 2004], or objects [Goel and others 1995]) and included the inference of another's mental state, such as the attribution of intention (Castelli and others 2000; Walter and others 2004; Harris and others 2005) and the observation of social interactions (Iacoboni and others 2004). MF1 is also involved when one has to evaluate his/her own mental state (Craik and others 1999; Ruby and Decety 2003; Sugiura and others 2004; den Ouden and others 2005; Johnson and others 2005; Ochsner and others 2005; Schmitz and Johnson 2005) or his/her own emotional state (Reiman and others 1997; Ochsner and others 2002) (Fig. 8A). The emotional content of the stimuli appears crucial because MF1 is activated by the perception of empathic situations, when one has to infer and share the emotional experiences of others (Lawrence and others 2006; Mitchell, Banaji, and Macrae 2005; Mitchell and others 2005a, 2005b; Hynes and others 2006; Vollm and others 2006). Upper MF1 involvement during emotional processing other than empathy is rare: in a review conducted by Phan and others (2002), it is the lower part of MF1 that is targeted by emotional processes, and only a few peaks of activation elicited by the perception of facial emotions (Blair and others 1999) or emotional words (Beauregard and others 1997) overlapped with the part of MF1 activated in the present study (Fig. 8A). Thus, the upper MF1, activated during affective classification, is very likely involved in the representation of internal mental states, whether it is self-reflection (Northoff and Bermpohl 2004) or another's mental state that has to be inferred (Gallagher and Frith 2003), a neural activity that appears to be enhanced by the emotional content of the stimuli to process (Gallagher and Frith 2003).


 
Figure 8 Meta-analysis in the medial frontal gyrus and pSTS: activations related to the processing of affective sentences are superimposed on the internal surface and sagittal slice of the MNI single subject. Peaks issued from studies dealing with TOM (squares), self (triangles), emotion (pink and purple circles), and syntactic processing (green circles) are represented. (A) Projection on the medial surface and (B) on the sagittal slice (x = –50) of the MNI-referenced brain template of 1) activation detected with a conjunction analysis of the affective minus grammatical classification of sentences uttered by actors and produced by Kali (from red to yellow, P ≤ 0.0025 corrected threshold for multiple comparisons); peaks of activation coming from 2) studies on TOM processing such as judging intentionality (Brunet and others 2000; Castelli and others 2000; Iacoboni and others 2004; Walter and others 2004; Harris and others 2005), comprehension of TOM stories (Fletcher and others 1995; Gallagher and others 2000; Vogeley and others 2001; Ferstl and Von Cramon 2002; Saxe and Kanwisher 2003), and judging others' knowledge (Goel and others 1995; Ruby 2004), 3) studies on self-reflection such as judgment of self-preference (Craik and others 1999; Sugiura and others 2004; Johnson and others 2005; Ochsner and others 2005; Schmitz and Johnson 2005), evaluation of self-knowledge (Ruby and Decety 2003; den Ouden and others 2005), empathic situations (Lawrence and others 2006; Mitchell, Banaji, and Macrae 2005; Mitchell and others 2005a, 2005b; Hynes and others 2006; Vollm and others 2006), self-evaluation of emotional content (Reiman and others 1997; Ochsner and others 2002), 4) studies on emotional processing comparing the processing of emotional with neutral words (Beauregard and others 1997) or faces (Blair and others 1999), or 5) studies on the integration of semantic and syntactic processing at the level of sentences (Embick and others 2000; Kuperberg and others 2000; Kircher and others 2001; Luke and others 2002) or texts (Goel and others 1998; Homae and others 2002). All peaks of activation were placed in the MNI stereotaxic space (for methods, see Jobard and others 2003).

 
The activation of the left pSTS area is likely related to the integration of semantic and syntactic processing, which is crucial for succeeding in the affective classification of sentences but unnecessary for the grammatical classification. As a matter of fact, together with the IFG, this area constitutes a network for semantic analysis (Vigneau and others 2006). As illustrated in Figure 8B, this leftward lateralized area overlaps with peaks elicited by sentence-processing tasks that necessitate semantic integration: judgment on grammatical errors compared with pronunciation errors (Embick and others 2000), generation of the final word of a sentence (Kircher and others 2001), and comprehension of coherent rather than incoherent sentences (Kuperberg and others 2000; Luke and others 2002). Such a role in the semantic integration of complex verbal material is not limited to sentences; this area is also involved during text comprehension, with greater activation when sentences constitute a dialog (Homae and others 2002) or compose a syllogism (Goel and others 1998) than when they are not linked.

Interestingly, this role of the left pSTS in text integration includes a specific involvement during the comprehension of TOM stories. Whereas some authors have shown that TOM stories engaged the left pSTS more than unlinked sentences, confirming its role in the semantic integration of complex material (Fletcher and others 1995; Ferstl and Von Cramon 2002), others have demonstrated an additional increase in activity of this region when they compared TOM stories with syntactically correct stories that described non-TOM events (Gallagher and others 2000; Saxe and Kanwisher 2003). This region is also engaged when one has to interpret others' intentions (Castelli and others 2000; Walter and others 2004), as well as when the representation of the self is needed, such as during self-evaluation (Ruby and Decety 2003; den Ouden and others 2005; Johnson and others 2005), the processing of empathic situations (Hynes and others 2006; Vollm and others 2006), or valence assessment of emotional film (Lane and others 1997; Reiman and others 1997).

These observations lead us to hypothesize that the role of the left pSTS, in the present study, cannot be restricted to the processing of the sentences' propositional content. We instead postulate that the left pSTS integrates the semantic and emotional content of speech to interpret the intended meaning of the speaker. The fact that the left pSTS together with MF1 were described as part of the core system for TOM processing (Gallagher and Frith 2003) suggests that activation of this network could be related to the computation of the speaker's mental state during affective classification. But considering the fact that activations in the upper MF1 and pSTS were also elicited by tasks relying on the self-evaluation of feelings or emotions (Fig. 8), subjects may as well have based their evaluation on a reflection about their own emotional state.

Note that this involvement of the MF1 or pSTS areas was independent of the presence of affective prosody because there was no observable modification in their activity when a lack of affective prosody increased the difficulty of the affective classification. Subjects' performances were relatively accurate in this condition (70% CA) leading to the conclusion that the possible call for TOM processing in emotional speech comprehension would be triggered by the affective semantic message rather than by the affective prosodic content of the sentences.

The Role of the Right Temporal Areas in Affective Prosody Processing

Although they were not explicitly informed of the presence of affective prosody, all subjects reported relying on intonation to solve the affective classification task in the presence of affective prosody. Their greater speed and accuracy during this task led to the conclusion that optimization of affective discourse comprehension by the presence of affective prosody is supported by 2 areas in the right temporal lobe.

The presence of affective prosody led to the activation of the right aSTS that closely matches the so-called HSVA (Belin and others 2004). The HSVA was defined as a bilateral region that responds more to the human vocal sounds than to the environmental sounds (Belin and others 2000; Kriegstein and others 2003) or to the vocalization of other species (e.g., monkeys) (Fecteau and others 2004b), and its activity increases even more when several speakers are heard (Belin and Zatorre 2003) (Fig. 5). In line with the present result, these findings imply the involvement of the right HSVA in the processing of affective prosody, a human-specific acoustical feature.

More precisely, the right HSVA is implicated in the treatment of the paralinguistic features of voice that allow identification of speaker's gender (Fecteau and others 2004a). This paralinguistic function was confirmed by Grandjean and others (2005), who identified a rightward asymmetry of HSVA when subjects had to detect the gender of the speaker during presentation of pseudosentences to the left ear. Based on the present results, we hypothesized that the right HSVA computed the emotional content of voice through the extraction of slow acoustical elements that characterized affective prosody. Indeed, this process had been previously evidenced as a right lateral temporal lobe expertise (Belin and others 1998; Griffiths and others 1998; Meyer and others 2002; Mitchell and others 2003; Wildgruber and others 2005).

The right pSTG was the second area that showed greater activity in presence of affective prosody in this study. This result recalls Ross's model on neural correlates of affective prosody: from observations of aprosodic patients, he postulated that the rightward cortical organization for affective prosodic comprehension parallels the leftward organization of propositional language (Ross 1981). Indeed, right pSTG can be considered as homologous to Wernicke's area. Thus, we hypothesize that the right pSTG would perform the first interpretation in terms of emotional labeling of the relevant prosodic features extracted in the right HSVA. This information would be further integrated with the linguistic information computed in the left homologue via transcallosal transfer in order to complete sentence comprehension (Ross and others 1997).

The significant rightward asymmetry in HSVA and pSTG observed in the present study allows reconciliation of both neuropsychological and functional views: it shows that affective prosody processing led to bilateral but rightward asymmetrical activation in the temporal areas essential for affective prosodic comprehension (Ross 1981). This finding reinforces the assumption that, in functional studies of affective prosody, additional leftward semantic resources were engaged to try to catch the meaning of filtered sentences (Meyer and others 2002; Kotz and others 2003) or sentences constructed with pseudowords (Price and others 1996; Grandjean and others 2005).

Reduction of Activity in the Audio-Motor Loop and the Amygdalae when Prosody Is Incongruent with Semantic Affective Content

Like the temporal areas, the amygdalae, precentral gyri, and left Heschl's gyrus exhibited greater activation during affective classification performed on sentences containing affective prosody than when this task was performed on sentences spoken without affective prosody. But as opposed to temporal regions, they were identified because of a decrease in activity during the affective classification task in the absence of affective prosody, rather than because they were activated by the presence of affective prosody (same amount of activity as during grammatical classification). This decrease was not related to the use of Kali, which had no impact on the neural activity of these areas during the grammatical classification (Fig. 6). Rather, this decrease appeared related to the fact that when Kali produced affective sentences, their prosodic and affective verbal content were not congruent. As a matter of fact, these Kali-produced sentences contained only grammatical prosody, which has a neutral valence on affective scaling and was thus incongruent with the sentences' strong affective semantic content. This decrease in activity can be interpreted as the suppression of the incongruent prosodic processing that interfered with the comprehension of affective sentences. Indeed, decreases in BOLD signal can be considered as indicators of reduced input and local computation in the cortical areas (Logothetis and others 2001).

In the present case, the suppression at work when the prosodic and affective messages were not congruent targeted 2 systems. The first was the processing of the affective content carried by the voice in the amygdalae (for a review, see Adolphs 2002). The amygdala appears to be involved in emotional processing of voice, as Scott and coworkers identified a deficit of emotional prosodic comprehension following lesions of the amygdalae (Scott and others 1997). Its decrease in activity would thus suggest the intervention of a filtering process that reduced the emotional processing of the inadequate prosody. The second system, made up of the left Heschl's gyrus and precentral gyrus, composes the audio-motor loop described by Hickok and Poeppel (2000) that enters into the processing of speech comprehension through an audio-motor simulation (Liberman and Whalen 2000). In the present case, the simulation of the speech that includes discordant prosody is very likely to be attenuated, possibly to allow the subjects to generate a more adequate prosody through mental imagery (Pihan and others 1997), as has been reported by some of them.

General Conclusion

This study made it possible to disentangle the networks involved in semantic and prosodic processing of emotional discourse. It reconciles views issued from neuropsychology of aprosodia and functional imaging reports by confirming that the right temporal lobe is essential for emotional prosody processing and presents a rightward lateralization. Indeed, it is the right HSVA, together with the pSTG, that processes the emotional prosody. In addition, the use of sentences with equivalent syntactic and semantic content made it possible to demonstrate that the involvement of the pars orbitaris of the right IFG was not linked with the presence of emotional prosody per se but rather with the presence of emotional words.

The present results open a new perspective: specific to emotional discourse is the activation of systems leading to the configuration of brain activity toward human social interactions. First, this is visible in the identification of HSVA as the region that processes emotional prosody. As a matter of fact, the role of HSVA in the right hemisphere can be expanded to the processing of social interactions charged with emotion. Right-damaged patients not only present a deficit of affective prosody comprehension (Ross 1981) but also exhibit a