Cerebral Cortex Advance Access originally published online on July 27, 2005
Cerebral Cortex 2006 16(4):587-595; doi:10.1093/cercor/bhj006

Published by Oxford University Press 2005.

Temporal Dissociation of Early Lexical Access and Articulation Using a Delayed Naming Task — An fMRI Study

Stefan Kemeny, Jiang Xu, Grace H. Park, Lara A. Hosey, Carla M. Wettig and Allen R. Braun

Language Section, Voice, Speech and Language Branch, National Institute on Deafness and other Communication Disorders, NIDCD, National Institutes of Health, Bethesda, MD 20892, USA

Address correspondence to Stefan Kemeny, NIH Clinical Center, 9000 Rockville Pike Bld, 10/Rm 3C-716, Bethesda, MD 20892, USA. Email: kemenys{at}nidcd.nih.gov.


    Abstract
 
Neuroimaging studies of overt speech hold an important practical advantage allowing monitoring of subject performance, particularly valuable in disorders like aphasia. However, speech production is not a monotonic process but a complex sequence of stages. Levelt and colleagues have described these as roughly corresponding to two originally independent systems — conceptual and sensorimotor — that are linked in the formulation and expression of spoken language. In the initial stages a word is chosen to match a concept (lexical selection); in the later stages the sound and motor patterns are encoded and the word is uttered (articulation). It has been difficult to discriminate these stages using conventional neuroimaging techniques. We designed a functional magnetic resonance imaging study in an attempt to do this, by introducing a latency into a conventional naming paradigm, delaying the articulated response. Our results showed that left hemisphere perisylvian areas were active throughout, interacting with visual and heteromodal areas during early lexical access and with motor and auditory areas during overt articulation. These results are consistent with the broadest version of the Levelt model and with that derived from Chomsky's minimalist program in which a core language system interacts with conceptual-intentional systems and articulatory-perceptual systems during the early and late stages of lexical access respectively.

Key Words: brain function • human brain mapping • magnetic resonance imaging • production • speech


    Introduction
 
In recent years, there has been a growing trend towards the use of overt speech paradigms in neuroimaging studies of language production. Historically, covert paradigms were preferred for blood oxygen level-dependent (BOLD) functional magnetic resonance imaging (fMRI) studies, because of movement and susceptibility artifacts generated during overt articulation. However, recent studies have shown that, for the production of brief utterances, the signal increase derived from the activation can be differentiated from these artifacts due to the hemodynamic characteristics of the BOLD response (Birn et al., 1999; Palmer et al., 2001).

The ability to study overt as opposed to covert speech holds important pragmatic advantages: it allows the researcher to directly observe subjects' responses and will ultimately make it possible to monitor individual performance measures like accuracy or response times that can be used as covariates of interest in the data analysis. Barch et al. (1999) showed that task compliance and performance need to be monitored even in studies with normal volunteers. This will be of even greater importance in studies of patients with speech-language disorders. In aphasic patients, for instance, only direct monitoring of overt speech production will make it possible to quantify performance measures and record errors such as semantic or phonemic paraphasias.

Studying overt speech is preferable from a theoretical perspective as well. Speech production is not a monotonic process but a complex sequence of steps ranging from the selection of a word to express an idea, to the point at which its phonetic form is encoded and the word is pronounced. These stages are essential components of an integrated process, a complete understanding of which is really possible only by studying the entire sequence of events from early lexical access through overt articulation.

The temporal order of these stages and the ways in which they interact (and the possibility of dissociating them experimentally) have been theoretically specified. One of the widely accepted models of word production is that proposed by Levelt (2001). The model, in its original form, describes word production as a serial two-system architecture (Levelt et al., 1999), roughly corresponding to two systems, conceptual and sensorimotor, that were originally independent but have become linked together in the formulation and expression of spoken language.

In the first stage the conceptual features of the utterance are specified. The speaker identifies a concept (‘conceptual preparation’) for which a target word or words exist. This activates a set of corresponding items within the mental lexicon. Selection of the most appropriate of these is termed ‘lexical selection’.

In the next stage, the sound and motor patterns of the word are specified. The word's phonological code is retrieved, providing a template for the motor program with which the speaker is able to generate the correct sequence of movements (‘phonetic encoding’), in order to articulate the target word (and self-monitor the utterance).

The more fully elaborated models developed by Levelt and colleagues (LRM or Weaver++) specify more discrete stages that are beyond the level of resolution of most contemporary neuroimaging techniques. It is, however, the broader, coarser-grained features of the model as originally conceived that have motivated the present study.

A similar but simpler model has emerged from Chomsky's contemporary minimalist program (Chomsky, 1995). According to this model, the language faculty consists of a cognitive component, a core computational system that performs the essential function of language, coupling sound and meaning. The use of language in the world (the performance component) involves the interaction of this core system with external systems that are relevant to language use, at two ‘interface levels’, one related to sound, the other to meaning. These, paralleling Levelt's two-tiered architecture, are the articulatory-perceptual and conceptual-intentional systems; taken together all of these systems constitute the faculty of language in the broad sense (Hauser et al., 2002).

The process of naming should begin with the interaction of the core language system and conceptual-intentional systems (corresponding to the earliest stages of lexical access) and terminate with the interaction of language and perceptual-articulatory systems (corresponding to the late stages of articulation and self-monitoring).

Whatever the model, a neuroimaging paradigm that could effectively distinguish early and late stages of word production, differentiating lexical selection and articulation, would be useful, particularly in clinical populations such as patients with Broca's aphasia, which can be characterized by deficits in both early and later stages of lexical access. Disambiguating these stages and independently evaluating the neural correlates of each would be particularly informative in the course of language recovery.

While electrophysiological methods (Abdullaev and Posner, 1998) have been used to demonstrate some of the temporal features of word production, the spatial resolution of these methods is poor. Hemodynamic neuroimaging, although not capable of the same temporal precision, would be able to address the issue with superior spatial resolution.

In a large-scale meta-analysis, Indefrey and colleagues evaluated the results of a number of different neuroimaging studies of speech production in order to associate regional activations with the various stages of the word production model (Indefrey and Levelt, 2004). Thus far, however, no study has attempted to differentiate the early and late stages of lexical access in a single experiment.

In the present study, we attempted to separate lexical selection and articulation by introducing a latency in the naming process and capitalizing on the temporal features of rapid event-related fMRI. We used a traditional confrontational naming paradigm in order to do this. A series of visual scenes depicting transitive actions were presented to subjects who were asked to overtly name the action or the object, in separate runs, after the appearance of a delayed cue, ~3 s after stimulus onset.

We used contrast analyses to characterize activations that were selectively associated with the early and late stages of lexical access respectively and a conjunction analysis to identify regions in which activations were detected in both.

As noted above, within these somewhat coarse temporal constraints, it may be impossible to precisely match activations with the individual sub-stages specified in Levelt's more fully detailed models. Indeed, it may be impossible to effectively interrupt the cascade of events involved in confrontational naming by imposing a delay in the process. Nevertheless, we expected there to be an activation bias when the broader stages of the model — lexical selection and articulation — are segregated in this fashion.

We hypothesized that areas involved in the earliest stages of lexical access (as the language system interacts with conceptual-intentional systems) would be activated during the initial presentation of the picture and should include visual association cortices (which process visual form and extract a lexical concept from the picture) and heteromodal areas that support matching the concept with a lexical item. In contrast, motor, auditory and other association areas involved in phonetic coding, overt articulation and self-monitoring would be selectively activated following the presentation of the response cue (as the language system interacts with articulatory-perceptual systems). Regions that play a role in core linguistic processes, probably encompassing perisylvian areas of the left hemisphere, would be active in both stages, and should therefore be present in the conjunctions.


    Materials and Methods
 
Subjects

Nineteen male subjects (mean age 32 ± 12 years) participated in the experiment. All subjects were right handed, with English as their native language and all were without a history of neurological or psychiatric disorders. All subjects were tested prior to the experiments using a combination of standardized test batteries that evaluated language [Boston Naming Test (Kaplan et al., 2001), Multilingual Aphasia Examination (Benton et al., 1978)], working memory [Doors and People Test (Baddeley et al., 1994)] and general cognitive performance [RBANS (Randolph, 1998)]. Subjects' individual results were within normal ranges on each test. Informed consent was obtained from all subjects prior to participation. The study was conducted in accordance with the Declaration of Helsinki, following approval by the NINDS/NIDCD Institutional Review Board.

Stimuli and Design

Subjects underwent training prior to scanning, during which they viewed a series of photographs depicting transitive actions (different from those used during the scanning session), and were instructed to overtly name the action or the object in the image. To separate lexical selection and articulation, subjects were instructed to produce the target word overtly only after seeing a visual cue, presented on average 3 s after presentation of the photograph. Subjects produced verbs or nouns in separate runs and were reminded before each run what to produce. The sequence of the runs was randomized between all subjects to avoid order effects.

One hundred and fifty-eight stimuli were chosen from a larger set based on word frequency and familiarity. Photographs were digitized and the contrast and luminance levels were adjusted for optimal resolution and visibility. The target verbs and nouns described commonly used actions, e.g. ‘throw ball’, ‘light candle’, etc. (Fig. 1).


Figure 1. Example stimulus: The picture depicts a transitive action used as visual stimulus in our experiment. The expected target verb in this case is ring, the expected object is bell.

 
We obtained reaction times using the same cued paradigm outside the scanner in a group of 10 normal subjects. Reaction times were recorded and measured following the onset of cue using the program Presentation® (Neurobehavioral Systems). Reaction times for verbs were 365.3 ± 148.3 ms (mean ± SD), range = 280.4–495.2; reaction times for nouns were 370.3 ± 155.2 ms, range = 269.2–591.7.

The final set of 158 stimuli was randomly divided into two equal groups, one for naming verbs and one for nouns, matched for word frequency and familiarity. We used an event-related paradigm in which the mean interstimulus interval (ISI) between presentations of the transitive action photographs was 8 s. Stimuli were presented for 500 ms. Following this, on average 3 s (jittered ±500 ms) after presentation of the photograph, a visual cue (a black and white square in the center of fixation, 300 ms in duration) was presented to indicate that the subject should overtly name the action or the object in the picture. Seventy-nine stimuli were presented in each run; stimuli were not repeated within or between runs. The total duration for each functional run was 10 min 56 s.
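The trial timing described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the presentation script used in the study: the function name `build_schedule`, the fixed 8 s spacing between pictures (the text gives only a mean ISI), the uniform jitter distribution and the random seed are all hypothetical choices.

```python
import random


def build_schedule(n_trials=79, mean_isi=8.0, cue_delay=3.0, jitter=0.5, seed=0):
    """Return a list of (picture_onset, cue_onset) pairs in seconds.

    Assumptions (not stated in the text): pictures are spaced at exactly
    the mean ISI, and the cue jitter is drawn from a uniform distribution.
    """
    rng = random.Random(seed)
    schedule = []
    for trial in range(n_trials):
        picture_onset = trial * mean_isi
        # Cue follows the picture by 3 s on average, jittered by +/- 0.5 s.
        cue_onset = picture_onset + cue_delay + rng.uniform(-jitter, jitter)
        schedule.append((picture_onset, cue_onset))
    return schedule


schedule = build_schedule()
for picture_onset, cue_onset in schedule[:3]:
    print(f"picture at {picture_onset:6.1f} s, cue at {cue_onset:6.2f} s")
```

The picture onsets would define the retrieval events and the cue onsets the articulation events in an event-related analysis.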

Data Acquisition

BOLD functional images were acquired with a 3 T whole-body scanner (GE Signa, General Electric, Milwaukee, WI) using a standard quadrature head coil and a gradient-echo EPI sequence. The scan parameters were as follows: TR = 2000 ms, TE = 30 ms, flip angle = 90°, 64 × 64 matrix, field of view 220 mm, 21 parallel axial slices covering the whole brain, 6 mm thickness. Four initial dummy scans were acquired during the establishment of equilibrium and discarded in the data analysis. Each run comprised 328 volumes, and two runs were scanned per subject. In addition to the functional data, high-resolution structural images were obtained using a standard clinical T1-weighted sequence. The subjects lay supine in the scanner, their heads secured with a padded strap placed across the forehead and fastened to the sides of the headcoil, without further mechanical restraint. The fit was tight but not uncomfortable for the subjects.
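As a quick consistency check on these parameters, the arithmetic below derives the run duration, in-plane voxel size and slab coverage from the values given in the text. The variable names are illustrative, and the coverage figure assumes contiguous slices with no gap, which the text does not state.

```python
TR_S = 2.0                 # repetition time in seconds
N_VOLUMES = 328            # volumes per run
FOV_MM = 220.0             # field of view
MATRIX = 64                # in-plane matrix size
N_SLICES = 21
SLICE_THICKNESS_MM = 6.0

run_seconds = TR_S * N_VOLUMES
minutes, seconds = divmod(int(run_seconds), 60)
in_plane_mm = FOV_MM / MATRIX
# Assumes contiguous slices (no inter-slice gap).
coverage_mm = N_SLICES * SLICE_THICKNESS_MM

print(f"run duration: {minutes} min {seconds} s")
print(f"in-plane voxel size: {in_plane_mm} mm")
print(f"axial coverage: {coverage_mm} mm")
```

The computed duration of 328 volumes at a 2 s TR agrees with the 10 min 56 s run length reported for each functional run.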

Stimuli were projected onto a matte screen from a laptop running Presentation® software (Neurobehavioral Systems); subjects viewed the screen through a headcoil-mounted mirror. The visual angle was 7.6° vertically and 10.2° horizontally.

Data Processing

Data analysis was performed using statistical parametric mapping (SPM 99, Wellcome Department of Cognitive Neurology, London, UK), implemented in Matlab (Mathworks Inc., Sherborn, MA). Functional runs for each subject were slice-timing corrected to compensate for acquisition delays, and motion corrected by realignment using the first volume in each run as a reference. The motion parameters were evaluated for excessive head movement. The resulting mean EPI image was normalized into canonical (Talairach) space using the Montreal Neurological Institute (MNI) template. The calculated transformation matrix was applied to the individual EPI images. Functional images were smoothed with an isotropic Gaussian filter of 9 mm.
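For readers unfamiliar with how a smoothing kernel width is specified, the sketch below converts a kernel width to the standard deviation of the Gaussian. It assumes that the 9 mm figure refers to the full width at half maximum (FWHM), which is the usual convention in SPM but is not stated explicitly in the text; the function names are hypothetical.

```python
import math


def fwhm_to_sigma(fwhm_mm):
    """Convert a Gaussian FWHM to its standard deviation (same units)."""
    return fwhm_mm / (2.0 * math.sqrt(2.0 * math.log(2.0)))


def gaussian_weight(distance_mm, sigma_mm):
    """Unnormalized Gaussian weight at a given distance from the center."""
    return math.exp(-(distance_mm ** 2) / (2.0 * sigma_mm ** 2))


sigma_mm = fwhm_to_sigma(9.0)  # assumes 9 mm is the FWHM
print(f"sigma = {sigma_mm:.3f} mm")
# By definition, the weight falls to one half at a distance of FWHM / 2.
print(f"weight at 4.5 mm = {gaussian_weight(4.5, sigma_mm):.3f}")
```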

For each run, precise event timings were used to define two principal conditions. Presentation (onset) of photographs depicting transitive actions defined the retrieval condition. Onset of the delayed stimulus (black and white square) that cued the overt naming response defined the articulation condition.

SPM was used to analyze the following contrasts:

  1. ‘Retrieval minus articulation’: the difference in activation between these conditions identified activity unique to (or significantly greater in magnitude during) the retrieval condition.
  2. ‘Articulation minus retrieval’: similarly identified activity unique to (or significantly greater in magnitude during) articulation.
  3. A conjunction analysis (Friston et al., 1999) was used to identify brain areas that were significantly active during both retrieval and articulation conditions.
Contrasts 1 and 2 were calculated using a random-effects group-analysis; results show t-values with a significance threshold of P < 0.001 (uncorrected). The conjunction was calculated in a fixed-effect group-analysis, results showing t-values with a significance threshold of P < 0.0001 (corrected).
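The logic of the three analyses can be illustrated with a toy single-voxel example. This is a minimal sketch, not the SPM99 implementation: the per-subject beta values are invented for illustration, the random-effects test is reduced to a one-sample t-test across subjects, and the conjunction is represented by the minimum of two t-values, which is one common formulation and an assumption here.

```python
import math
import statistics

# Contrast weights over the two conditions (retrieval, articulation).
CONTRASTS = {
    "retrieval_minus_articulation": (1.0, -1.0),
    "articulation_minus_retrieval": (-1.0, 1.0),
}


def contrast_value(betas, weights):
    """Weighted sum of condition estimates for one subject and voxel."""
    return sum(b * w for b, w in zip(betas, weights))


def one_sample_t(values):
    """t-statistic testing whether the mean across subjects differs from 0."""
    n = len(values)
    return statistics.mean(values) / (statistics.stdev(values) / math.sqrt(n))


def conjunction_min_t(t_a, t_b):
    """Minimum-statistic conjunction: both effects must be large."""
    return min(t_a, t_b)


# Hypothetical (retrieval, articulation) estimates for five subjects.
subject_betas = [(2.1, 0.4), (1.8, 0.9), (2.5, 0.2), (1.9, 0.7), (2.2, 0.5)]

t_values = {}
for name, weights in CONTRASTS.items():
    per_subject = [contrast_value(b, weights) for b in subject_betas]
    t_values[name] = one_sample_t(per_subject)

# Each condition tested against baseline, then combined.
t_retrieval = one_sample_t([b[0] for b in subject_betas])
t_articulation = one_sample_t([b[1] for b in subject_betas])
t_conjunction = conjunction_min_t(t_retrieval, t_articulation)

print(t_values, t_conjunction)
```

In this toy voxel the first contrast is positive and the second is its mirror image, while the conjunction is limited by the weaker of the two condition effects.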


    Results
 
The estimated head-movement parameters from the realignment step showed no excessive movement; in particular, no strong task-correlated movement was detected during articulation. The maximal movement effect for translation was <0.6 mm, and maximal rotation was <0.5°.

The first part of our analysis focused on differences between retrieval and articulation events. Direct SPM contrasts were calculated using the data from both functional runs.

The contrast ‘retrieval minus articulation’ revealed significantly greater activity in primary visual and visual association areas, in prefrontal, premotor, parietal and cingulate cortices and in subcortical structures (Fig. 2; Table 1).


Figure 2. Retrieval–articulation: the statistical parametric map for this contrast shows activation unique to the retrieval condition. The random-effects group-analysis results (n = 19) are thresholded at P < 0.001, uncorrected. t-scores are indicated in the color bar. Results are overlaid on selected slices of an MNI-normalized T1 MRI template; the numbers in the left corner show the MNI z-coordinate of each slice. Left hemisphere is represented on the left side of the figure. Activation is primarily found in the visual occipitotemporal cortices, and in heteromodal areas of the parietal and frontal lobes (see Table 1).

 

Table 1 Retrieval–articulation: random-effects group-analysis (n = 19) showing the anatomical localizations and corresponding t-scores (P < 0.001 uncorrected) for this contrast. Activation is primarily found in the visual occipitotemporal cortices, and also in the parietal lobe and frontally

 
The strongest activation was seen in the calcarine cortex (BA 17), with additional foci in the lingual gyri (BA 18) and the lateral occipital cortex (BA 19). The fusiform gyrus (BA 37) was activated bilaterally along its entire extent. Temporal lobe activations were also present and included the inferior temporal gyrus (ITG, BA 20).

The inferior frontal gyrus (BA 44) showed significantly elevated activity, as did the dorsal lateral premotor cortex (BA 6), and the pre-SMA. Activity in the medial hemispheres of the cerebellum and in the lateral geniculate of the thalamus was also greater during retrieval.

We also observed activation of heteromodal cortices at the ventral temporo-parieto-occipital (TPO) junction (BA 39,19). Activated parietal areas included the precuneus (BA 7), superior parietal lobule (SPL, BA 7) and intraparietal sulcus (IPS). Activations were also detected in the posterior cingulate cortex (BA 23,31). All activations with the exception of the precuneus were bilateral.

The contrast ‘articulation minus retrieval’ showed significantly greater activity in frontal and temporal areas associated with motor control and auditory perception respectively, and in subcortical regions (Fig. 3; Table 2).


Figure 3. Articulation–retrieval: the statistical parametric map for this contrast shows activation unique for the articulation condition. Formatting and thresholding are as described in Figure 2. Activation is found primarily in frontal and temporal regions generally associated with motor control and auditory perception (see Table 2).

 

Table 2 Articulation – retrieval: random-effects analysis as described in Table 1, showing the localizations and corresponding t-scores for the contrast articulation–retrieval. Activation is mostly in frontal and temporal regions generally associated with motor control and auditory perception

 
The precentral gyrus (BA 4,6) was selectively activated bilaterally, with distinct local maxima in its ventral, medial and dorsal portions. This contrast also revealed activation in the SMA proper (BA 6), the right ventral lateral premotor cortex (BA 44,6) and in a widespread extent of the insula, encompassing more anterior portions in the right hemisphere and posterior portions in the left. Both ventral (BA 32) and dorsal (BA 24,32) portions of the anterior cingulate cortex (ACC) were activated bilaterally. In the parietal lobe, bilateral foci were located in the SPL (BA 7, dorsal to the area identified in the ‘retrieval minus articulation’ contrast).

In the temporal lobe, bilateral activation of the transverse temporal (TTG, BA 41) and anterior superior temporal gyri (STG, BA 22) extended into the superior temporal sulcus (STS) in both hemispheres. Activation of the posterior STG (BA 22) was also observed, in this case strongly lateralized to the left.

Bilateral clusters of activation were detected in the cerebellum, in the hemispheres (medial to the regions identified in the ‘retrieval minus articulation’ contrast) and in the vermis. Bilateral activation of the posterior putamen, the ventral thalamus and the dorsomedial anterior thalamus was also observed.

While the preceding analyses identified activations that were selectively greater during either retrieval or articulation, the conjunction analysis identified activations common to both conditions (Fig. 4; Table 3).


Figure 4. Conjunction of articulation and retrieval: the results show only brain areas that are activated in both conditions. The fixed-effect group result is thresholded at P < 0.0001, corrected. Results are rendered on the MNI standard template with SPM. Shared activations include the classical perisylvian language areas, lateralized to the left hemisphere (see Table 3), as well as bilateral STS/MTG and precentral gyri.

 

Table 3 Conjunction of articulation and retrieval: conjunction analysis group result (fixed-effect), t-scores thresholded at P < 0.0001 (corrected)

 
The shared activations principally included classical left hemisphere language areas. These were found in anterior perisylvian cortices, including the left frontal operculum (pars triangularis, BA 45, and pars orbitalis, BA 47), and extending into the midportion of the left insula. Conjunctions were also found in posterior perisylvian cortices, including the left anterior middle temporal (MTG, BA 21), supramarginal (SMG, BA 40) and angular (BA 39) gyri; bilateral activations were detected in the posterior MTG, extending into the superior temporal sulcus (STS, BA 21,22).

Common activations were also detected in the precentral gyri (BA 4), the medial cerebellar hemispheres, posterior putamen and ventral portions of the thalamus (in regions distinct from those identified in the preceding contrasts).


    Discussion
 
This study was designed to capitalize on the temporal resolution of event-related fMRI in an attempt to separate and independently evaluate the essential components of speech production. In order to do this we used overt speech to examine the entire cascade of events, from early lexical access through the execution of articulatory movements.

Our work was guided by what has become a standard contemporary model for word production, that proposed by Levelt and co-workers (Levelt, 1989, 2001). While the more detailed versions of this model (e.g. LRM, WEAVER++) denote a series of discrete stages that are beyond the level of temporal resolution obtained with event-related fMRI, we felt that the broader, coarser-grained features of the model as originally conceived could be differentiated by this method.

In his original description, Levelt characterizes the process of word production as a serial two-system architecture corresponding to the operation of two broad, originally independent systems, conceptual and sensorimotor, that are engaged in the formulation and expression of spoken language. This paradigm is theoretically similar to one that emerges from Chomsky's contemporary minimalist program (Chomsky, 1995).

The earliest stages as defined by Levelt begin with ‘conceptual preparation’ as a speaker chooses the concept that he or she wishes to express. If the business of language is to provide a link between sound and meaning, this is the level at which meaning is encoded. In Chomsky's model, this is the point at which the language system interacts with more general conceptual-intentional systems. In the present experiment the lexical concept is represented, as the action or the object, in the visual image presented to the subject. The next step, ‘lexical selection’, takes place as the speaker selects a lexical item, a word to match this concept, from the mental lexicon. Responses that are most tightly coupled to the processing of the picture (detected in the retrieval–articulation contrast) should reflect these early stages.

Activations that are strongest following the response cue (detected in the articulation–retrieval contrast) should more likely reflect the later stages, which Levelt terms ‘form encoding’. It is during these terminal stages that the phonological form of the word is translated into the associated motor patterns with which the speaker articulates the chosen item. The speaker then monitors his or her output. In Chomsky's model, this is the point at which the language system interacts with sensorimotor (articulatory and perceptual) systems.

The contrasts specified above should then differentiate the earliest stages of lexical access and the later stages of phonetic encoding, articulation and self-monitoring, highlighting regions that are selectively associated with these stages.

The shared features — activations that are present during both retrieval and articulation phases — are indexed by the conjunctions, and should represent regions that support linguistic processes during both the early and late stages of lexical access. These regions may play a role in more than one essential linguistic computation. In Chomsky's parlance the conjunctions should identify the core language system that links the conceptual-intentional and perceptual-articulatory domains.

Conjunctions – Identification of a Core Language System

The conjunction highlighted perisylvian cortices bilaterally, but activations were strongly lateralized to the left. In the anterior perisylvian regions, these included activation of the inferior frontal gyrus (ventral operculum, BA 45, 47) and contiguous portions of the insula. In posterior perisylvian areas, these included left lateralized activation of the MTG extending into posterior STS (BA 21,22) as well as activation of the left inferior parietal lobule (IPL, BA 39,40) extending into the TPO junction.

These regions are traditionally considered to represent the ‘core’ language system and as such have been shown by neuroimaging studies to participate in a wide variety of linguistic tasks ranging from phonological to syntactic to semantic processing (Dronkers, 1996; Wise et al., 1999; Cabeza and Nyberg, 2000; Bookheimer, 2002; Heim and Friederici, 2003). In the meta-analysis of Indefrey and Levelt (2004) these areas have been associated with many of the core components of word production: syllabification, phonological code retrieval and articulatory planning.

It should be noted that a subset of these regions (the supramarginal gyrus and frontal operculum) also appear to play a role in verbal working memory mechanisms (Ravizza et al., 2004), which may be engaged during the period in which subjects delay their spoken response.

The conjunctions additionally included motor-related areas (primary motor cortex, putamen, ventral thalamus and cerebellum) that might reflect subvocalization — or inhibition of vocalization — that may take place during the delay, or could indicate articulatory processes implicitly activated prior to the cued response.

In a general sense, the fact that there are conjunctions at all — i.e. regions activated throughout the entire process of word generation — may be considered problematic. After all, our paradigm was designed to separate theoretically discrete stages of word production and to identify regional activations independently associated with each of these. How are we then to interpret the activations found in the conjunction analysis? Are these areas in fact active during both early and late stages of lexical access? Or can these stages and the regions that support them simply not be effectively dissociated because of technical limitations?

For example, as noted in the Introduction, it may be impossible to completely separate the theoretical stages of Levelt's model within the temporal constraints imposed by event-related fMRI; the resolution of the method itself might make effective separation impossible. We approached the problem using a finite (2 s) TR and an obligatorily longer (3 s) delay. It may simply not be possible to separate processes that occur on an inherently finer-grained timescale. Similar studies using methods such as EEG/ERP or MEG would potentially offer complementary temporal information that event-related fMRI is not capable of providing.

Another possible difficulty with our design, alluded to above, is that the delay — any delay — may not prevent activation of processes related to the later stages of form-encoding and articulation during the earlier stages. It is possible that certain steps — e.g. phonological encoding — occur implicitly and cannot be effectively postponed by imposing a delay.

Moreover, the delay may also provoke activation of regions that support additional cognitive processes — e.g. working memory, anticipation, response inhibition, self-monitoring — that are not present during natural everyday speech per se, but may be selectively engaged in this paradigm.

The reaction times we obtained in the behavioral experiment do support the idea that a separation of the processes was to some degree accomplished by introducing a cue to delay the overt naming. Our reaction times were consistently faster (mean 368 ms) than reaction times typically reported in other confrontational naming studies; for example, Indefrey and Levelt (2004) state that a ‘typical total duration of lexical preparation...from picture onset to the initiation of the naming response...[is] typically some 600 ms’. This suggests that portions of the word production cascade executed in the early phase (before the cue) are not repeated after the cue is presented. That is, these times are consistent with the processing cascade having stopped at an intermediate stage and resumed at the cue.

Another possibility exists as well: perhaps our ‘failure’ to clearly isolate the substages of Levelt's model is only apparent because the assumption that they can be dissociated anatomically is wrong: the inability to disambiguate may not be a limitation caused by technical or design constraints, but may reflect the way the brain actually works.

That there are distinct early and late stages of lexical access is clear, i.e. the psycholinguistic stages themselves are psychologically dissociable. However, the idea that certain language areas will play a role in the early stages only and that another distinct set of regions will be engaged during the later stages may be largely incorrect.

That may be a more accurate notion in the case of the contrasts, where the earliest and latest processes associated with retrieval and articulation respectively engage systems with which language interacts — visual, conceptual, sensorimotor.

But the perisylvian language cortices themselves may naturally be represented in the conjunctions, because these regions play an integral role in processes that occur in both the early and late stages of lexical access.

A number of examples of this have been cited in the meta-analysis reported by Indefrey and Levelt (2004): the frontal operculum appears to support functions ranging from semantics to syllabification. The middle temporal gyrus, albeit different portions of this region, plays a role in conceptually driven lexical access as well as phonological code retrieval. Indeed, both regions have been implicated in a wide variety of linguistic as well as non-linguistic operations (Cabeza and Nyberg, 2000). In other words, there may not be discrete brain systems that isomorphically map onto each of the individual stages of Levelt's model.

Instead, regions that make up the core language system may perform computations that support generative properties at all linguistic levels, that underlie early lexical access as well as phonological and phonetic encoding. The elements of the core language system may thus be naturally active during both portions of the experiment.

On the other hand, our results may fit a simpler model, in which the stages of word generation are discrete, as described in Levelt's model, but a core network of perisylvian regions performs computations associated with each. This core set of regions would be active throughout, interacting with other systems related to language use (visual, conceptual, auditory and motor) at either end of the naming process. As noted, this is consistent with the psychological model that emerges from Chomsky's minimalist program (Chomsky, 1995).

This core language system would thus be expected to be highlighted in the conjunction analysis, while interactions of this system with conceptual-intentional and articulatory-perceptual systems should be identified by the contrasts.
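
The logic of this prediction can be sketched with toy numbers. The example below is an illustration only, not the authors' statistical pipeline: the region labels, statistic values and threshold are hypothetical, and thresholding a simple difference of statistics stands in for a proper contrast test. A region counts toward the conjunction when both phases exceed the threshold, and toward a contrast when one phase exceeds the other.

```python
import numpy as np

# Hypothetical regions and toy statistics (not data from the study).
regions = ["perisylvian", "visual", "motor", "unrelated"]
t_retrieval = np.array([5.0, 6.0, 1.0, 0.5])     # retrieval vs. baseline
t_articulation = np.array([5.5, 1.0, 6.5, 0.3])  # articulation vs. baseline
threshold = 3.0  # arbitrary cut-off chosen for illustration

# Conjunction: the smaller of the two statistics must exceed the threshold,
# i.e. the region is active in both phases.
conjunction = np.minimum(t_retrieval, t_articulation) > threshold

# Directional contrasts: one phase exceeds the other by more than the threshold.
retrieval_minus_articulation = (t_retrieval - t_articulation) > threshold
articulation_minus_retrieval = (t_articulation - t_retrieval) > threshold

def names(mask):
    """Return the region labels selected by a boolean mask."""
    return [r for r, m in zip(regions, mask) if m]

print("conjunction:", names(conjunction))
print("retrieval - articulation:", names(retrieval_minus_articulation))
print("articulation - retrieval:", names(articulation_minus_retrieval))
```

In this toy setting the region active in both phases appears only in the conjunction, while regions active in a single phase appear only in the corresponding contrast, mirroring the division between a core system and its two interfaces.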

Retrieval Minus Articulation – Interaction of Language and Conceptual-intentional Systems

The differences detected in this contrast identified areas in which activations were greater in magnitude for retrieval than articulation, i.e. selectively associated with the earliest stages of lexical access, beginning with the presentation and processing of the visual stimulus, representing the point at which the language system and conceptual-intentional systems interact.

In this experimental paradigm, visual object recognition should represent the first step in conceptual preparation. Unsurprisingly, this contrast showed prominent activation of occipital and occipitotemporal cortices, consistent with previous studies of visual image processing. Strong activation was found in the primary visual cortex and more specialized visual association areas in the ventral occipitotemporal cortex (McKeefry and Zeki, 1997). Activations in the lateral occipital and basal temporal areas are considered to reflect the role of these regions in object recognition processes (Martin et al., 1996; Kanwisher et al., 1997; Downing et al., 2001; Grill-Spector, 2003).

While the regions of the ventral visual pathway are activated by object recognition, the dorsal occipitoparietal pathway processes information about spatial localization (Haxby et al., 1991). We observed retrieval-related activation in the terminal portions of this pathway, in the SPL and IPS, which might reflect the visual search performed by the subjects as they tracked visual targets in the scenes presented to them (Corbetta et al., 1995).

The fusiform gyrus has been shown to be involved in linguistic tasks that extend beyond the scope of pure object recognition (Moore and Price, 1999). As discussed by Price and Devlin (2003), activation in the left midfusiform area is observed when subjects hear, repeat or think about the meaning of words. Perceptual and semantic processing in distinct parts of the fusiform gyri (Simons et al., 2003) suggest that this region may play a role in both visual object recognition and semantic processing, perhaps making the visual percept as well as its semantic associations available as input to conceptual preparation.

We also detected retrieval-related activity in posterior heteromodal association areas at the temporoparietal-occipital junction, including the angular gyrus. Involvement of this region is not surprising. Lesions of the angular gyrus have been shown to produce a classic anomia (Geschwind et al., 1997), i.e. difficulty naming on confrontation without any deficits in semantic knowledge. Such patients show a selective inability to retrieve words (they have word-finding difficulties in spontaneous speech and cannot name objects, but will pick out the correct name if this is presented to them). All of this suggests that the region may play a role in lexical selection, that is, in associating the concept, the action or the object represented in the picture with the corresponding verb or noun.

In the frontal lobe, we observed activation of the lateral inferior frontal cortex extending from the dorsal operculum (BA 44) to the precentral gyrus (BA 6). Bookheimer (2002), in a review of studies that examined the role of the IFG in language processing, reported that this multifunctional region has been activated in semantic tasks, consistent with its activation during the earlier stages of lexical access in our study.

Another interpretation is possible given that the dorsal portion of the operculum (BA 44, 6) is considered a constituent of the lateral premotor system (Roland and Zilles, 1996). As such, retrieval-related activation might reflect an instance in which motor regions play a role in the processes of scanning and selection during early lexical access, an example of the interaction between language and motor systems at the interface between cognition and action (Georgopoulos, 2000).

Consistent with this notion, we saw activation of additional motor-related areas that previous studies have suggested may be involved in word retrieval (Heun et al., 2000). The pre-SMA has been considered to function as an interface between prefrontal cognitive and higher-order motor areas, differing from the SMA proper, which is more directly involved in the execution of complex motor programs (Picard and Strick, 2001). Consistent with this, we observed selective activation of the pre-SMA during retrieval, when this region might participate in selection of a lexical item with information derived from the frontal cortex.

The cerebellum might be similarly involved in early retrieval processes. In addition to its role in sensorimotor integration, the cerebellum has been shown to be active in a variety of linguistic tasks (Fiez, 1996; Gebhart et al., 2002). A number of studies (e.g. Roskies et al., 2001) have demonstrated selective activation of the posteromedial cerebellum during semantic tasks involving lexical retrieval, consistent with the activations we observed here.

Articulation Minus Retrieval – Interaction of Language and Articulatory-perceptual Systems

Differences observed in the articulation–retrieval contrast, on the other hand, should pinpoint regions associated with the final stages of the naming task: areas that support encoding of sound and motor patterns, articulation of the target word and perception of the utterance, the point at which language and articulatory-perceptual systems interact.

Accordingly, this contrast highlighted frontal motor and superior temporal regions, which previous studies have shown to be related to motor aspects of articulation and auditory speech perception.

While the pre-SMA was more active during retrieval, the SMA proper was selectively activated in the articulation phase. This is consistent with the proposed differences in the function of these medial premotor regions: as noted above, the pre-SMA may play a role at the borderland of cognitive and motor function; the SMA proper is more immediately involved in the direct execution of motor programs (Picard and Strick, 2001) such as articulation, via interactions with primary motor cortex. This portion of the SMA may thus play a role in phonetic encoding and may represent the site of an articulatory buffer (Klapp, 2003), which subsequently stores the elements of a selected motor program prior to articulation. The meta-analysis of Indefrey and Levelt (2004) found evidence that the SMA may participate in such functions.

Articulation was associated with widespread activation of the insula, extending along its anterior–posterior axis, and most robust in the midportion of both hemispheres. The insula appears to function, in part, as a premotor structure (Roland and Zilles, 1996). Activation of the left anterior insula has previously been reported in speech articulation studies (Roland and Zilles, 1996), and it has been argued that this region plays a central role in motor planning of speech (Dronkers, 1996).

Activation of the anterior cingulate cortex has been associated with aspects of motor control and movement generation. The dorsal portion (rostral cingulate zone or RCZ) in which activity was detected in this contrast has been linked to selection and initiation of motor programs (Picard and Strick, 2001). The ventral portion, in which we also found discrete foci of activity, has been associated specifically with speech-related oral motor activity (Paus et al., 1993).

The ‘articulation minus retrieval’ contrast also highlighted regions that exert more immediate control over the articulators. Thus, we observed bilateral activation of the primary motor and premotor cortex (BA 4, 6) with distinct local maxima similar to those that appear to represent labial and lingual components of articulation (Lotze et al., 2000), as well as ventral sensorimotor regions that may correspond to pharyngeal-diaphragmatic activation (Wise et al., 1999). Similarly, activation of the cerebellar hemispheres, posterior pallidum, ventral thalamic nucleus and lateral premotor cortex has also frequently been reported in association with overt speech production (Wise et al., 1999; Blank et al., 2002).

As noted, in addition to regions involved in motor planning and articulation, we observed selective activation of auditory cortical areas, consistent with the fact that articulation engages both motor and perceptual systems, as subjects monitor their own vocal output.

Accordingly, we observed bilateral activation of primary auditory cortex in the transverse temporal gyri (TTG) extending into auditory association areas along the anterior STG, and left lateralized activation of the posterior STG extending into the planum temporale. These activations extended into the superior temporal sulcus and were most robust in the anterior portions of the STS in both hemispheres. These regions have previously been described as preferentially activated by speech and voice as opposed to more general auditory stimuli (Binder, 1999; Belin et al., 2000).


    Conclusions
 
Theoretical Considerations

Using event-related fMRI we have been able to differentiate and image discrete stages in the process of confrontational naming, roughly corresponding to the fundamental two-tiered architecture of Levelt's model for word generation, which distinguishes the participation of conceptual and sensorimotor systems in the formulation and production of spoken language.

We did not detect regional activations that were selectively associated with the detailed psycholinguistic substages that characterize more comprehensive versions of this model (e.g. LRM, Weaver++). As discussed, this could be due to technical restrictions (i.e. limited temporal resolution of hemodynamic imaging methods) or because it may in fact be impossible to isomorphically associate fine-grained linguistic processes with unique regional activations, given the multifunctionality of the perisylvian areas themselves (which may perform linguistic computations at multiple points throughout the naming process).

In any case, the conjunction analysis highlighted the core perisylvian language areas and these could not be further separated — they were active throughout the entire process. The contrasts, on the other hand, clearly showed that there are unique patterns associated with the earliest and latest stages of lexical access respectively: coactivation of visual and heteromodal association areas during the early stages, and of motor and auditory areas during the late stages.

These results may more naturally fit a simpler model, consistent with the one proposed by Chomsky in his ‘minimalist’ program (Chomsky, 1995). In this model there exists a core computational language system that supports the fundamental business of language, namely linking sound and meaning. We suggest that this core system, which constitutes the faculty of language in the narrow sense (Hauser et al., 2002), is instantiated in the set of perisylvian regions that are active throughout the process of lexical access.

According to Chomsky, the use of language in the world (language performance) requires the interaction of this core system with external systems that are relevant to language use; this occurs at two interface levels. We suggest that these interface levels are specified by the contrasts, which demonstrate the interaction of the core language system with conceptual-intentional systems at the onset (retrieval–articulation) and with articulatory-perceptual systems at the conclusion (articulation–retrieval) of the naming process.

The interface levels proposed by Chomsky are functionally congruent with the primary features of the Levelt model: interaction of perisylvian and conceptual-intentional systems supporting conceptual preparation and lexical selection, and interaction of perisylvian and articulatory-perceptual systems supporting phonological encoding, articulation and self-monitoring.

Taken together, core linguistic, conceptual and sensorimotor systems constitute the faculty of language in the broad sense (Hauser et al., 2002). The elements of this larger system and their points of interface may be effectively disambiguated by our imaging paradigm.

Practical Considerations

If we have shown that it is possible at the very least to differentiate the earliest stages of lexical retrieval from the late stages of articulation, our paradigm may ultimately be of pragmatic value: an fMRI design that permits independent evaluation of these stages in a single study should be potentially useful in studying clinical disorders characterized by loss of function at either of these levels.

For example, Broca's aphasia is frequently characterized both by word finding deficits (which may manifest during our ‘retrieval’ phase) and articulatory impairments or apraxia of speech (which may manifest during ‘articulation’).

In addition, some of the possible methodological confounds of this paradigm as noted above — activation of cognitive processes during the cued delay that are not part of natural everyday speech — may not be relevant in studies in aphasia. Here a delay in the articulated response would be expected — it is a pathophysiological feature of the disorder — rather than superimposed. Patients should be naturally engaged in the cognitive task of interest throughout (although performance should be impaired) without confounds that might accompany an enforced delay in control subjects.

The ability to follow aphasic patients using this sort of paradigm could help identify the central correlates of distinct pathological deficits and the ways in which these may change independently in the course of language recovery.

Pathophysiological features could be similarly differentiated in other speech-language disorders with deficits in the early and late stages of lexical access and word production. These would include Alzheimer's disease and other dementias, developmental stuttering (in which motor symptoms are exacerbated during spontaneous lexical access) and Parkinson's disease (in which patients show deficits in verbal fluency as well as articulation).

The present study has demonstrated the general feasibility of such a paradigm. To evaluate these specific clinical populations, the paradigm would need to be modified, adding experimental manipulations that would more explicitly match specific deficits. The results obtained might serve as endpoints with which to evaluate therapeutic interventions.


    References
 
Abdullaev YG, Posner MI (1998) Event-related brain potential imaging of semantic encoding during processing single words. Neuroimage 7:1–13.

Baddeley A, Emslie H, Nimmo-Smith I (1994) Doors and people manual. Thurston: Thames Valley Test Company.

Barch DM, Sabb FW, Carter CS, Braver TS, Noll DC, Cohen JD (1999) Overt verbal responding during fMRI scanning: empirical investigations of problems and potential solutions. Neuroimage 10:642–657.

Belin P, Zatorre RJ, Lafaille P, Ahad P, Pike B (2000) Voice-selective areas in human auditory cortex. Nature 403:309–312.

Benton AL, Hamsher Kd, Sivan AB (1978) Multilingual aphasia examination. Lutz: Psychological Assessment Resources, Inc.

Binder JR (1999) Functional MRI of the language system. In: Functional MRI (Moonen CTW, Bandettini PA, eds), pp. 407–419. Berlin: Springer-Verlag.

Birn RM, Bandettini PA, Cox RW, Shaker R (1999) Event-related fMRI of tasks involving brief motion. Hum Brain Mapp 7:106–114.

Blank SC, Scott SK, Murphy K, Warburton E, Wise RJ (2002) Speech production: Wernicke, Broca and beyond. Brain 125:1829–1838.

Bookheimer S (2002) Functional MRI of language: new approaches to understanding the cortical organization of semantic processing. Annu Rev Neurosci 25:151–188.

Cabeza R, Nyberg L (2000) Imaging cognition II: An empirical review of 275 PET and fMRI studies. J Cogn Neurosci 12:1–47.

Chomsky N (1995) The minimalist program. Cambridge, MA: MIT Press.

Corbetta M, Shulman GL, Miezin FM, Petersen SE (1995) Superior parietal cortex activation during spatial attention shifts and visual feature conjunction. Science 270:802–805.

Downing PE, Jiang Y, Shuman M, Kanwisher N (2001) A cortical area selective for visual processing of the human body. Science 293:2470–2473.

Dronkers NF (1996) A new brain region for coordinating speech articulation. Nature 384:159–161.

Fiez JA (1996) Cerebellar contributions to cognition. Neuron 16:13–15.

Friston KJ, Holmes AP, Price CJ, Buchel C, Worsley KJ (1999) Multisubject fMRI studies and conjunction analyses. Neuroimage 10:385–396.

Gebhart AL, Petersen SE, Thach WT (2002) Role of the posterolateral cerebellum in language. Ann NY Acad Sci 978:318–333.

Georgopoulos AP (2000) Neural aspects of cognitive motor control. Curr Opin Neurobiol 10:238–241.

Geschwind N, Devinsky O, Schachter SC (1997) Norman Geschwind: selected publications on language, epilepsy, and behavior. Boston, MA: Butterworth-Heinemann.

Grill-Spector K (2003) The neural basis of object perception. Curr Opin Neurobiol 13:159–166.

Hauser MD, Chomsky N, Fitch WT (2002) The faculty of language: what is it, who has it, and how did it evolve? Science 298:1569–1579.

Haxby JV, Grady CL, Horwitz B, Ungerleider LG, Mishkin M, Carson RE, Herscovitch P, Schapiro MB, Rapoport SI (1991) Dissociation of object and spatial visual processing pathways in human extrastriate cortex. Proc Natl Acad Sci USA 88:1621–1625.

Heim S, Friederici AD (2003) Phonological processing in language production: time course of brain activity. Neuroreport 14:2031–2033.

Heun R, Jessen F, Klose U, Erb M, Granath D, Freymann N, Grodd W (2000) Interindividual variation of cerebral activation during encoding and retrieval of words. Eur Psychiatry 15:470–479.

Indefrey P, Levelt WJ (2004) The spatial and temporal signatures of word production components. Cognition 92:101–144.

Kanwisher N, McDermott J, Chun MM (1997) The fusiform face area: a module in human extrastriate cortex specialized for face perception. J Neurosci 17:4302–4311.

Kaplan E, Goodglass H, Weintraub S (2001) Boston naming test. Baltimore, MD: Lippincott Williams & Wilkins.

Klapp ST (2003) Reaction time analysis of two types of motor preparation for speech articulation: action as a sequence of chunks. J Mot Behav 35:135–150.

Levelt WJM (1989) Speaking: from intention to articulation. Cambridge, MA: MIT Press.

Levelt WJ (2001) Spoken word production: a theory of lexical access. Proc Natl Acad Sci USA 98:13464–13471.

Levelt WJ, Roelofs A, Meyer AS (1999) A theory of lexical access in speech production. Behav Brain Sci 22:1–38; discussion 38–75.

Lotze M, Seggewies G, Erb M, Grodd W, Birbaumer N (2000) The representation of articulation in the primary sensorimotor cortex. Neuroreport 11:2985–2989.

Martin A, Wiggs CL, Ungerleider LG, Haxby JV (1996) Neural correlates of category-specific knowledge. Nature 379:649–652.

McKeefry DJ, Zeki S (1997) The position and topography of the human colour centre as revealed by functional magnetic resonance imaging. Brain 120:2229–2242.

Moore CJ, Price CJ (1999) Three distinct ventral occipitotemporal regions for reading and object naming. Neuroimage 10:181–192.

Palmer ED, Rosen HJ, Ojemann JG, Buckner RL, Kelley WM, Petersen SE (2001) An event-related fMRI study of overt and covert word stem completion. Neuroimage 14:182–193.

Paus T, Petrides M, Evans AC, Meyer E (1993) Role of the human anterior cingulate cortex in the control of oculomotor, manual, and speech responses: a positron emission tomography study. J Neurophysiol 70:453–469.

Picard N, Strick PL (2001) Imaging the premotor areas. Curr Opin Neurobiol 11:663–672.

Price CJ, Devlin JT (2003) The myth of the visual word form area. Neuroimage 19:473–481.

Randolph C (1998) Repeatable battery for the assessment of neuropsychological status (RBANS). San Antonio: The Psychological Corporation.

Ravizza SM, Delgado MR, Chein JM, Becker JT, Fiez JA (2004) Functional dissociations within the inferior parietal cortex in verbal working memory. Neuroimage 22:562–573.

Roland PE, Zilles K (1996) Functions and structures of the motor cortices in humans. Curr Opin Neurobiol 6:773–781.

Roskies AL, Fiez JA, Balota DA, Raichle ME, Petersen SE (2001) Task-dependent modulation of regions in the left inferior frontal cortex during semantic processing. J Cogn Neurosci 13:829–843.

Simons JS, Koutstaal W, Prince S, Wagner AD, Schacter DL (2003) Neural mechanisms of visual object priming: evidence for perceptual and semantic distinctions in fusiform cortex. Neuroimage 19:613–626.

Wise RJ, Greene J, Buchel C, Scott SK (1999) Brain regions involved in articulation. Lancet 353:1057–1061.





