Cerebral Cortex Advance Access originally published online on September 1, 2004
Cerebral Cortex 2005 15(3):303-316; doi:10.1093/cercor/bhh132
Cerebral Cortex V 15 N 3 © Oxford University Press 2004; all rights reserved

Positive Evidence against Human Hippocampal Involvement in Working Memory Maintenance of Familiar Stimuli

Eric Zarahn, Brian Rakitin, Diane Abela, Joseph Flynn and Yaakov Stern

Cognitive Neuroscience Division, Taub Institute, P & S Box 16, 630 West 168th Street, Columbia University, New York, NY 10032, USA

Address correspondence to Eric Zarahn, Cognitive Neuroscience Division, Taub Institute, P & S Box 16, 630 West 168th Street, Columbia University, New York, NY 10032, USA. Email: ez84@columbia.edu.


    Abstract
 
Subjects (n = 40) performed a delayed item recognition task for visually presented letters with three set sizes (1, 3 or 6 letters). Accuracy was close to ceiling at all set sizes, so we took set size as a proxy for WM load (i.e. the amount of information being maintained in WM). Functional magnetic resonance imaging (fMRI) signal associated with the delay period increased in a nearly linear fashion with WM load in the left inferior frontal gyrus/anterior insula (possibly Broca's area, BA 44/45), right anterior insula, bilateral caudate, bilateral precentral gyrus (BA 6), bilateral middle frontal gyrus (BA 9/46), bilateral inferior parietal lobule (with foci in both BA 39 and 40), left superior parietal lobule (BA 7), medial frontal gyrus (BA 6), anterior cingulate gyrus (BA 32) and bilateral superior frontal gyrus (BA 8). These results lend support to the idea that at least some of the cortical mechanisms of WM maintenance, potentially rehearsal, exhibit a scaling with WM load. In contrast, the delay-related fMRI signal in hippocampus followed an inverted U-shape, being greatest during the intermediate level of WM load, with relatively lower values at the lowest and highest levels of WM load. This pattern of delay-related fMRI activity, orthogonal to WM load, is seemingly not consonant with a role for hippocampus in WM maintenance of phonologically codable stimuli. This finding could possibly be related more to the general familiarity of the letter stimuli than their phonological codability per se.

Key Words: hippocampus • memory load • parietal cortex • prefrontal cortex • working memory


    Introduction
 
As assessed with delayed response and delayed-match-to-sample tasks, prefrontal cortex (PFC) seems to play a necessary role in working memory (WM) maintenance in non-human primates (Goldman and Rosvold, 1970; Bauer and Fuster, 1976; Passingham, 1985; Funahashi et al., 1993; Quintana and Fuster, 1993). However, there is currently debate over the extent to which human PFC, in particular dorsolateral PFC (DLPFC; BA 9 and 46), is necessary for WM maintenance. While the results of some neuropsychological studies have been interpreted as supporting a necessary role for PFC in WM maintenance (Freedman and Oscar-Berman, 1986; Verin et al., 1993), others have been interpreted as implying that human DLPFC is only necessary for monitoring of information within working memory or attentional processing that is critical when competing information or interference is present (Malmo, 1942; D'Esposito and Postle, 1999; Petrides, 2000), and not maintenance per se.

The delay period of a delayed-(non)match-to-sample (or delayed item recognition) task is the trial phase during which subjects must maintain information about a previously presented stimulus in order to perform at above chance levels. Both dorsolateral and ventrolateral PFC in humans have been reported to show sustained neurophysiological activity during the delay period (Courtney et al., 1997; Manoach et al., 1997; Postle and D'Esposito, 1999; Rypma and D'Esposito, 1999; D'Esposito et al., 2000; Jha and McCarthy, 2000; Zarahn et al., 2000; Veltman et al., 2003). Furthermore, the delay period neurophysiological activity in parts of dorsolateral and ventrolateral PFC has been reported to be monotonically related to experimental factors thought to selectively vary WM load, i.e. the amount of information ostensibly being stored in WM (Manoach et al., 1997; Glahn et al., 2002; Rypma et al., 2002; Veltman et al., 2003). The PFC is not unique in this regard, as similar WM load dependence of brain activity has also been reported for other areas, including parietal cortex (Manoach et al., 1997; Veltman et al., 2003). However, not all neuroimaging data unambiguously support a role for DLPFC in WM maintenance. For example, Postle et al. (1999) observed evidence for WM manipulation sensitivity in DLPFC in 5/5 subjects. In contrast, WM load sensitivity in DLPFC was observed in only 2/5 subjects, while evidence for WM load sensitivity in left perisylvian cortex was seen in 5/5 subjects, suggesting that DLPFC is only weakly involved in pure WM maintenance (Postle et al., 1999). Similarly, another study, while supporting a role for DLPFC in WM maintenance, suggested a larger role for this cortical region in manipulation of information within WM (D'Esposito et al., 1999).
Rypma and colleagues interpreted their result of a correlation in DLPFC between WM load and fMRI signal attributable to the delay period of a delayed item recognition task as an indication that DLPFC plays a role in strategic memory organization (as opposed to WM maintenance). Based on neuroimaging data, Owen et al. (1996) theorized that DLPFC is engaged only when WM manipulation is required. There have been multiple studies showing activation of DLPFC during the performance of n-back tasks (Braver et al., 1997; Druzgal and D'Esposito, 2001; Glahn et al., 2002; Veltman et al., 2003), in which WM manipulation and maintenance demands are confounded. So, while there is little doubt that WM manipulation is associated with DLPFC activation in humans, there is arguably less certainty in the field about whether this brain region is also involved in the construct of pure WM maintenance.

In the current study, our aim was to characterize the multiplicity and nature of the spatial patterns of WM maintenance-associated brain activity that are modulated by WM load. Towards this end, neurophysiological responses temporally associated with the delay period of a delayed item recognition task for visually presented letters were measured with blood-oxygenation-level-dependent functional magnetic resonance imaging (BOLD fMRI). Measurement of the fMRI response attributable to the delay period should have provided a highly enriched measurement of neural activity associated with WM maintenance (Fuster et al., 1982). Set size (our intended manipulation of WM load; see below for rationale) was varied across trials. As the task stimuli were visually presented, and there was no articulatory interference during the delay, the task was thought to primarily tap phonological WM (Baddeley, 1986), and to not require manipulation or monitoring of information (D'Esposito et al., 1999). As the delay period was only 7 s, maintenance of the trial-unique information was not thought to require long-term memory [LTM] (Drachman and Arbit, 1966; Cave and Squire, 1992; Alvarez et al., 1994).

It can be argued that a monotonic relationship between the degree of engagement of a given cognitive process and the intensity of neurophysiological activity in a brain region is strong inductive evidence of their being mechanistically related (Braver et al., 1997; Beauchamp et al., 2001). If one hypothetically manipulates WM load, one would by definition vary the degree of engagement of WM maintenance. Hence, a correlation between WM load and the intensity of delay period neurophysiologic activity [as measured with fMRI (Logothetis et al., 2001)] in some brain region would support that this activity is somehow related to WM maintenance. But, as WM load (at least as we define the term here) is simply a description of the degree of engagement of the cognitive process of WM maintenance, it cannot be manipulated directly. One might think that by simply increasing set size, one is necessarily varying WM load. This need not be so, as delayed item recognition accuracy might decrease as set size increases such that the total amount of information maintained in WM (i.e. WM load) remains constant across different set sizes. For example, this would certainly be expected once one exceeds the buffer capacity of WM (Cowan, 2001; Vogel et al., 2001). However, as the amount of information that must be maintained in WM to achieve a given level of delayed item recognition accuracy is proportional to set size, a constant level of accuracy across set sizes would be consistent with a positive relationship between set size and WM load. Our logic in relating set size to WM load, then, is the following: if delayed item recognition accuracy is relatively constant across set sizes (an assumption which we test), then one can take set size as a proxy for WM load.

Regarding the existence of WM load-related brain activity patterns, one hypothesis is that there is only a single WM load-related spatial pattern, and the expression of this pattern increases monotonically with WM load. As discussed above, this result would be consistent with the brain areas represented strongly in this single pattern being mechanistically related to, and perhaps mediating, WM maintenance. From past results, this single spatial pattern would be expected to weight heavily bilateral DLPFC, left inferior PFC and bilateral parietal cortices (Manoach et al., 1997; Rypma et al., 1999; Veltman et al., 2003). A broad, competing hypothesis is that the relationship between WM load and delay period fMRI activity manifests more than one spatial pattern in the brain. A second pattern would suggest either (i) the existence of multiple WM maintenance-related systems with different load sensitivities or (ii) the presence of WM irrelevant brain activity that is nevertheless spuriously dependent on WM load. If delay period fMRI signal were orthogonal to WM load in the second component, then explanation (ii) (i.e. that areas in which this component is dominant are not involved in WM maintenance) would seem to be the most parsimonious and plausible. This is because, by the definition of WM load, the expression of this second component would then be orthogonal to the engagement of WM maintenance. In passing, we note that it is not a paradox to conceive of one variable being dependent on, yet orthogonal to, another (e.g. consider the relationship of the functions x² and x on the interval [–1,1]).
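
As a brief worked illustration of the closing remark (a sketch using the usual inner product on functions over the interval, which is our assumption about the sense of "orthogonal" intended), the square function is fully determined by x and yet has zero inner product with it:

```latex
\[
\langle x^{2},\, x \rangle
  \;=\; \int_{-1}^{1} x^{2}\cdot x \,\mathrm{d}x
  \;=\; \left[\frac{x^{4}}{4}\right]_{-1}^{1}
  \;=\; \frac{1}{4}-\frac{1}{4}
  \;=\; 0 .
\]
```

The integrand is an odd function on a symmetric interval, so the integral vanishes even though x² depends deterministically on x.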

The multiplicity of the spatial patterns relating WM load and delay period fMRI activity was assessed via the application of the multivariate linear modeling (MLM) theory of Worsley and colleagues, a statistical method involving singular value decomposition (SVD) of a (number of voxels) × (number of effects of interest) data matrix (Worsley et al., 1997). Like standard statistical parametric mapping (SPM), MLM involves voxel-wise application of the general linear model, but instead of statistically testing for effects of interest (e.g. a relationship with WM load) at each voxel, the statistical testing assesses the existence of any such effects simultaneously at all voxels. Thus, in the presence of spatially distributed effects, MLM will tend to have superior detection power compared with SPM voxel-wise testing (Worsley et al., 1995). Furthermore, MLM decomposes the effects of interest into mutually orthogonal spatial patterns, and statistically assesses the number of true spatial patterns. Thus, a second advantage of MLM over pure reliance on voxel-wise testing is that MLM affords explicit testing of hypotheses concerning the number of spatial patterns required to summarize the effects of interest. It is for these two reasons (having superior power and allowing us to test the competing hypotheses concerning the number of WM load-related spatial patterns) that we chose MLM over the more standard SPM approach. A disadvantage of MLM compared with SPM is that the latter provides formal, statistical tests of spatial localization, while the former does not. However, MLM does provide descriptive localization results in the form of spatial patterns.

Region-of-interest (ROI) approaches also provide formal spatial localization, albeit at a coarser scale than voxel-wise tests. A ROI approach was not used as the primary method because we wished to detect WM load dependence during the delay period without making strong assumptions about precisely where or at what spatial scale such effects might exist. Also, we wished to determine the number of spatial patterns associated with WM load dependence, information that standard ROI methods do not provide. However, we did use ROI methodology in post hoc analyses to see if we could disconfirm our primary findings obtained from MLM.


    Materials and Methods
 
Subjects

Forty healthy, young subjects (30 male and 10 female; mean age ± SD = 25.1 ± 3.9; mean years of education = 15.6 ± 1.5; all right handed), recruited from the Columbia University student population, participated in experiment 1. All subjects supplied informed consent. Volunteers were screened for psychiatric and neurologic illness via a questionnaire.

Behavioral Task

The behavioral task used was a delayed item recognition task for letters (Sternberg, 1966). Each trial lasted a total of 16 s. Subjects were instructed to respond as accurately as possible. No feedback about their performance was given during the scanning session. The sequence of events within a delayed item recognition trial was as follows (Fig. 1): first, a 3 s presentation of a blank screen marked the beginning of the trial. Then, an array of one, three or six capital letters (the number of letters being the set size) was presented for 3 s, the subjects having been instructed to encode these letters. The geometry of the stimuli was a 2 × 3 array, regardless of set size, with asterisks acting as non-letter placeholders for set sizes 1 and 3 (Fig. 1). With the offset of the letter array, subjects were instructed to focus their gaze on the blank screen and hold the stimulus items in mind for a 7 s maintenance interval (i.e. the delay period). Finally, a probe letter (lowercase, centered in the field of view) appeared for 3 s. In response to the probe, subjects indicated by a button press whether or not the probe matched a letter in the study array (right index finger button press to indicate ‘yes’, left index finger button press to indicate ‘no’).



 
Figure 1. The delayed item recognition task is schematized.

 
Each experimental block contained 10 trials at each of the three set sizes, with five true negative and five true positive probes per set size. BOLD fMRI data were acquired for three experimental blocks per subject, yielding a total of 30 experimental trials per set size per subject. Blank trials (presentation of a blank screen for 2 s, requiring no behavioral output) were pseudo-randomly interspersed between delayed item recognition trials to both provide a baseline condition for positive control purposes and reduce the likelihood of neurophysiological responses predictive of the beginning of trials. The pseudo-randomization of these blank trials was via a random-without-replacement scheme (thus, more than one blank trial could occur sequentially, leading to an effectively jittered inter-trial interval), with a total of 70 blank trials per block. The presentation of delayed item recognition trials of different set sizes was also pseudo-randomly sequenced via a random-without-replacement scheme. The duration of each block was 620 s. There were breaks of approximately one minute between blocks.
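
The stated trial and block durations can be cross-checked with simple arithmetic. The sketch below uses only the timings given in the task description; the constant and function names are illustrative and not part of the original study materials.

```python
# Timings (in seconds) as stated in the task description; names are illustrative.
TRIAL_S = 3 + 3 + 7 + 3       # blank screen + letter array + delay + probe
N_SET_SIZES = 3               # set sizes 1, 3 and 6
TRIALS_PER_SET_SIZE = 10      # per block
BLANK_TRIALS = 70             # per block
BLANK_S = 2                   # duration of one blank trial


def block_duration_s():
    """Total block duration: task trials plus interspersed blank trials."""
    task = N_SET_SIZES * TRIALS_PER_SET_SIZE * TRIAL_S
    blanks = BLANK_TRIALS * BLANK_S
    return task + blanks


print(TRIAL_S, block_duration_s())
```

Under these figures a trial comes to 16 s and a block to 620 s, matching the values reported in the text.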

Subjects were trained on seven blocks of delayed item recognition trials on the evening prior to the acquisition of fMRI data, the first six of which were administered with feedback. The training session was conducted to reduce task-related skill learning during the course of fMRI scanning.

fMRI Data Acquisition

During the performance of each block of the delayed item recognition task, 207 T2*-weighted images, which are BOLD images (Kwong et al., 1992; Ogawa et al., 1993), were acquired with an Intera 1.5 T Philips MR scanner equipped with a standard quadrature head coil, using a gradient echo echo-planar (GE-EPI) sequence [TE = 50 ms; TR = 3000 ms; flip angle = 90°; 64 × 64 matrix; in-plane voxel size = 3.124 mm × 3.124 mm; slice thickness = 8 mm (no gap); 17 trans-axial slices per volume]. Four additional GE-EPI excitations were performed before the task began, at the beginning of each run, to allow transverse magnetization immediately after radio-frequency excitation to approach its steady-state value; the image data for these excitations were purposely discarded. A T2-weighted, fast spin echo structural image was also acquired from each subject for spatial normalization purposes [TE = 100 ms; TR = 2000 ms; flip angle = 90°; 256 × 256 matrix; in-plane voxel size = 0.781 mm × 0.781 mm; slice thickness = 8 mm (no gap); 17 trans-axial slices per volume].

Task stimuli were back-projected onto a screen located at the foot of the MRI bed using an LCD projector. Subjects viewed the screen via a mirror system located in the head coil. Responses were made on a LUMItouch response system (Photon Control Company). Task onset was electronically synchronized with the MRI acquisition computer. Task administration and data collection (reaction time and accuracy) were controlled using PsyScope (Cohen et al., 1993).

fMRI Data Pre-processing

All image pre-processing and analysis was done using the SPM99 program (Wellcome Department of Cognitive Neurology) and other code written in MATLAB 5.3 (Mathworks, Natick, MA). The following steps were taken in turn for each subject's GE-EPI dataset. Data were corrected for the order of slice acquisition, using the first slice acquired in the TR as the reference. All GE-EPI images were realigned to the first volume of the first session. The T2-weighted structural image was then co-registered to the first EPI volume using the mutual information co-registration algorithm implemented in SPM99. This co-registered high-resolution image was then used to determine parameters (7 × 8 × 7 non-linear basis functions) for transformation into a Talairach standard space (Talairach and Tournoux, 1988) defined by the Montreal Neurological Institute (MNI) template brain supplied with SPM99. This transformation was then applied to the GE-EPI data, which were re-sliced using sinc-interpolation to 2 mm × 2 mm × 2 mm.

fMRI Statistical Analysis

Time-series Modeling

The fMRI data analysis comprised two levels of voxel-wise general linear models (GLMs; Holmes and Friston, 1998). In the first-level GLM, the GE-EPI time-series were simultaneously modeled with regressors representing the expected BOLD fMRI response (implicitly, relative to the inter-trial interval baseline) to the delayed item recognition trial components of stimulus presentation, delay period, and probe presentation/response, separately for each set size. The regressors were constructed by convolutions of an indicator sequence (i.e. a train of discrete-time delta functions) representing delayed item recognition trial component onsets, an assumed BOLD impulse response function (as represented by default in SPM99) and a rectangular function of duration dictated by the duration of the relevant trial component (Zarahn, 2000). This led to nine predictors of interest at this GLM stage. Each of the nine parameter estimate images produced per subject was then intensity normalized (via voxel-wise division by the time-series mean) and spatially smoothed with an isotropic Gaussian kernel (full-width-at-half-maximum = 8 mm). The resulting images were used as the dependent data in a second-level, voxel-wise GLM (Holmes and Friston, 1998) and subjected to MLM (Worsley et al., 1997). While images corresponding to all conditions (i.e. all set sizes and all trial components) acted as dependent variables in the second-level model, only results concerning the delay period are presented in this paper. The effect of these additional dependent variables was to increase the error degrees of freedom at each voxel (ν).
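
The regressor construction described above (indicator sequence convolved with a rectangular function and an impulse response function) can be sketched as follows. This is a minimal illustration, not the authors' MATLAB/SPM99 code: the double-gamma `hrf` is an assumed stand-in for the SPM99 default response, the 0.1 s time grid and the function names are hypothetical, and interspersed blank trials are ignored for brevity.

```python
import math
import numpy as np

DT = 0.1   # fine time grid in seconds (assumption made for this sketch)
TR = 3.0   # repetition time in seconds, from the acquisition parameters


def hrf(t):
    """Illustrative double-gamma impulse response (assumed shape, not SPM99 code)."""
    peak = t ** 5 * np.exp(-t) / math.gamma(6)
    undershoot = t ** 15 * np.exp(-t) / math.gamma(16)
    return peak - undershoot / 6.0


def make_regressor(onsets_s, duration_s, total_s):
    """Indicator train * boxcar * impulse response, sampled once per TR."""
    n = int(round(total_s / DT))
    indicator = np.zeros(n)
    indicator[np.round(np.asarray(onsets_s) / DT).astype(int)] = 1.0
    boxcar = np.ones(int(round(duration_s / DT)))      # rectangular function
    neural = np.convolve(indicator, boxcar)[:n]        # assumed neural time course
    bold = np.convolve(neural, hrf(np.arange(0, 32, DT)))[:n] * DT
    return bold[:: int(round(TR / DT))]                # one sample per volume


# Three back-to-back 16 s trials; the delay starts 6 s into a trial and lasts 7 s.
trial_onsets = np.array([0.0, 16.0, 32.0])
delay_regressor = make_regressor(trial_onsets + 6.0, 7.0, total_s=96.0)
print(delay_regressor.shape)
```

Building one such regressor for each of the three trial components at each of the three set sizes would give the nine predictors of interest mentioned in the text.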

This fMRI time-series modeling framework (Zarahn, 2000) is similar to that described in an earlier report that used shifted impulse response functions as regressors (Zarahn et al., 1997), except that the current approach assumes the durations of the neural responses temporally associated with each trial component; this extra assumption affords greater accuracy in attributing components of the fMRI response to the various trial components. An assumption of both approaches is linearity and time-invariance of the system that transforms neural activity to fMRI signal (Logothetis et al., 2001). For a full discussion of the assumptions of these related methods, see Zarahn (2000). It is important to stress that neither this nor any other time-series modeling method can extract unbiased estimates of the neural response amplitude associated with any component of the trial (including the delay component) from fMRI data in an ‘assumption-free’ manner. However, given that the key assumptions stated above are reasonably satisfied, this method will yield nearly unbiased estimates of the neural response amplitudes associated with each trial component.

Sequential Latent Root Testing

In MLM, an SVD is performed on the de-correlated/whitened effects of interest, followed by sequential latent root testing (with α controlled at a desired level) to assess the number of latent spatial patterns of effects (Worsley et al., 1997). SVD decomposes a (data) matrix into components such that the first component explains the greatest amount of variance; the second explains the greatest amount of variance after accounting for the first, and so on. Each SVD component has an associated singular value (or equivalently, an eigenvalue, which is the square of the singular value), a number that indicates how much variance the component explains relative to noise. To statistically assess the number of true spatial patterns, a sequential latent root testing procedure (involving F-statistics) is used to compare these singular values to the magnitude of the unexplained data variability (Worsley et al., 1997).
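
The ordering of SVD components and the relation between singular values and eigenvalues can be illustrated on synthetic data. The sketch below is not the MLM procedure itself: it omits whitening and the sequential latent root test, and the matrix `effects`, its dimensions and the two simulated patterns are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)   # fixed seed so the sketch is reproducible
n_voxels = 500                   # hypothetical number of voxels

# Hypothetical (voxels x effects of interest) matrix built from two spatial
# patterns with different strengths, plus a small amount of noise.
pattern_a = rng.standard_normal(n_voxels)
pattern_b = rng.standard_normal(n_voxels)
effects = (3.0 * np.outer(pattern_a, [0.8, 0.6])
           + 1.0 * np.outer(pattern_b, [-0.6, 0.8])
           + 0.1 * rng.standard_normal((n_voxels, 2)))

# Columns of U are spatial patterns; rows of Vt are their expressions across effects.
U, s, Vt = np.linalg.svd(effects, full_matrices=False)
eigenvalues = s ** 2                          # eigenvalue = squared singular value
explained = eigenvalues / eigenvalues.sum()   # share of total sum of squares

print(s, explained)
```

NumPy returns the singular values in descending order, so the first component carries the largest share of variance, and the spatial patterns (columns of `U`) are mutually orthogonal.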

F-statistic Degrees of Freedom for Sequential Latent Root Testing

For the assumptions and theoretical background of sequential latent root testing, see Worsley et al. (1997). Both the numerator and denominator degrees of freedom for the sequential latent root testing F-statistics are much larger than what are commonly seen in the behavioral sciences or neuroimaging. In part, this is because both numerator and denominator degrees of freedom depend on the so-called ‘effective spatial degrees of freedom’ (d), which is proportional to the volume of the imaging dataset and inversely proportional to the spatial smoothness of the errors (Worsley et al., 1997); the estimated d = 424.5 in our dataset. For the test for the existence of one or more spatial components, the numerator degrees of freedom = d × the number of effects of interest, and the denominator degrees of freedom = d × ν – (d – 1) × (4 × the number of effects of interest + 2 × ν)/(the number of effects of interest + 2). The degrees of freedom for tests for the existence of additional components (i.e. beyond one) have related formulae (Worsley et al., 1997). The repeated measures covariance matrix was estimated at each voxel, and the spatial average of these estimates was used as the known observation error covariance matrix. The value of ν was estimated to be 164.9 from this matrix (Worsley and Friston, 1995), which is substantially less than the number of observations per voxel minus the rank of the design matrix (= 40 × 9 – 9 = 351); this is because of the correlation in the repeated measures (Worsley and Friston, 1995).
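
The degrees-of-freedom formula above can be evaluated directly with the reported estimates. The sketch below simply transcribes the formula as written in the text; the function and argument names are illustrative.

```python
def latent_root_df(d, n_effects, nu):
    """Degrees of freedom for the test of one or more spatial components,
    transcribed from the formula given in the text."""
    numerator = d * n_effects
    denominator = d * nu - (d - 1) * (4 * n_effects + 2 * nu) / (n_effects + 2)
    return numerator, denominator


# Values reported in the text: d = 424.5, two effects of interest, nu = 164.9.
num_df, den_df = latent_root_df(d=424.5, n_effects=2, nu=164.9)

# Naive error degrees of freedom ignoring repeated-measures correlation.
naive_nu = 40 * 9 - 9

print(round(num_df), round(den_df), naive_nu)
```

With the rounded inputs this gives a numerator of 849 and a denominator of about 34235, close to the F(849, 34239) reported in the Results; the small gap in the denominator is presumably due to rounding of d and ν in the text.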

A potential basis for confusion regarding MLM is the source of covariance that one is examining. There are three components of covariance in the effects of interest at the second-level. These comprise (i) the spatial auto-covariance of the GLM errors (potentially emanating from various mechanisms, including image processing, but ultimately assumed to be well modeled by a Gaussian point spread function); (ii) the covariance between the estimation errors of the different effects of interest (caused by the structures of the contrasts defining the effects of interest, the structures of the first-level and second-level design matrices, and fMRI time-series auto-covariance); and (iii) the deterministic/systematic similarity between the true spatial patterns of the effects of interest. It is only this third component which is relevant to MLM; MLM accounts implicitly for the other two covariance sources.

Representations of SVD Components

Each SVD component can be visualized in two related ways. One is as a plot relating delay period fMRI signal to set size (set size possibly being a valid proxy for WM load, depending on how accuracy varies with set size). The second is as a spatial pattern (i.e. brain image) whose expression is modulated by set size in the way described by the corresponding delay period fMRI signal versus set size plot. These two representations provide qualitative descriptions of the detected patterns.

In the context of this report, the effects of interest in the MLM lie in the 2-dimensional contrast space spanned by the difference in delay period fMRI signal amplitude between (i) 1 letter and 3 letters, and (ii) 3 letters and 6 letters. Thus, the number of effects of interest = 2, and so the maximum number of true set size-related components was two (the minimum number always being zero). The sequential latent root test controlled α at 0.05. Note that as MLM concerns statistical inference on the number of patterns of effects, it assesses spatially omnibus null hypotheses, as opposed to assessing the map-wise significance of effects at each voxel. As the spatial patterns (scaled by their corresponding singular values) resulting from this approach are t-maps (Worsley et al., 1997), they were thresholded for descriptive purposes at a t value (≈ z value, as ν > 100) of 4 and a cluster size of 100 voxels. Likely cytoarchitectonic labels for local maxima in these thresholded patterns were obtained using MSU software (Positron Emission Tomography Lab of the Institute of the Human Brain, St. Petersburg, Russia; http://www.ihb.spb.ru/~pet_lab/MSU/MSUMain.html).
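
To make the geometry of this 2-dimensional contrast space concrete, the sketch below builds the two difference contrasts, the mean-centered set size vector, and the mean-zero direction orthogonal to it. The variable names are illustrative, and the sign of the orthogonal direction is chosen for display only.

```python
import numpy as np

set_sizes = np.array([1.0, 3.0, 6.0])

# The two difference contrasts over (1 letter, 3 letters, 6 letters).
contrasts = np.array([[-1.0, 1.0, 0.0],    # 3 letters minus 1 letter
                      [0.0, -1.0, 1.0]])   # 6 letters minus 3 letters

# Mean-centered set size: the "linear with set size" direction in this space.
load = set_sizes - set_sizes.mean()

# The only other mean-zero direction orthogonal to it (up to sign and scale).
ortho = np.cross(load, np.ones(3))

load_unit = load / np.linalg.norm(load)
ortho_unit = ortho / np.linalg.norm(ortho)

print(np.linalg.matrix_rank(contrasts), ortho, load_unit @ ortho_unit)
```

The orthogonal direction is proportional to (–3, 5, –2), i.e. highest at the intermediate set size and lower at both extremes. Any expression profile in the contrast space is a weighted sum of these two directions.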


    Results
 
Behavior

As expected (Sternberg, 1966), mean reaction time was affected by set size [F(2,78) = 106.2, P < 0.0001], with the relationship being close to linear (R = 0.98), with a slope of 59 ms/letter (Fig. 2). Though this phenomenon might be somehow related to the nature of maintenance during the delay period (Jou, 2001), the comparison of the probe item with the elements of the initial stimulus set must occur after the delay period.



 
Figure 2. The relationship of reaction time (RT) to set size is plotted. The error bars reflect standard errors of the means from a regression model that included a subject factor and a categorical WM load factor. The line is a least squares fit.

 
Accuracy on the delayed item recognition task was very high across set sizes [percent correct averaged across set sizes = 97.4%; d' (Green, 1988) averaged across set sizes = 3.35]. There was no effect of set size on accuracy as assessed with either percent correct [F(2,78) = 1.24, P = 0.30] or d' [F(2,78) = 1.17, P = 0.32]. From the logic explicated in the Introduction, this implies that WM load increased with set size. Given this result, we take set size as a proxy for WM load, and so use the term ‘WM load’ in place of ‘set size’ when describing the neuroimaging data.
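
For readers unfamiliar with d', the sketch below shows the textbook yes/no sensitivity index, z(hit rate) minus z(false alarm rate). This is an assumption on our part: the exact computation used in the study follows Green (1988) and may differ, and the rates in the example are hypothetical rather than taken from the data.

```python
from statistics import NormalDist


def d_prime(hit_rate, false_alarm_rate):
    """Textbook yes/no sensitivity index: z(hits) - z(false alarms).

    Both rates must lie strictly between 0 and 1; any correction applied
    to extreme rates in the original study is not reproduced here.
    """
    z = NormalDist().inv_cdf
    return z(hit_rate) - z(false_alarm_rate)


# Hypothetical rates for illustration only (not values from the study).
print(d_prime(0.97, 0.02))
```

A d' of zero corresponds to chance discrimination between matching and non-matching probes, and larger values indicate better discrimination independent of response bias.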

WM Load-related Patterns of Delay Period fMRI Signal

There were two significant patterns of delay period fMRI signal with respect to WM load [test for one or more components: F(849,34239) = 2.89, P < 0.0001; test for two components: F(425,22881) = 1.95, P < 0.0001]. The eigenvalues (which are directly related to the F-statistics; Worsley et al., 1997) for the first and second patterns were 3.88 and 1.97, respectively (under the null hypothesis, the eigenvalues are approximately unity). Thus, after accounting for noise (by subtracting 1 from each eigenvalue, giving 2.88 and 0.97), the first pattern accounted for approximately three times as much WM load-related variance in the brain as the second. The first component was nearly linear with WM load (R = 0.99; this correlation coefficient is presented descriptively, and should not be interpreted statistically), while the second component was nearly orthogonal to WM load (R = 0.02; Fig. 3). This need not have been the case; even though different components from the same SVD would have to be orthogonal to each other (in the space of the SVD), it was mathematically possible that both could have been correlated up to an R value of 0.71 (= 0.5^(1/2)) with WM load.
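
The bound of 0.71 can be derived in a few lines. The sketch below assumes, for illustration, that the two expression vectors u₁ and u₂ are orthonormal and mean-zero, and that the mean-centered, unit-normalized WM load vector w lies in their span; the symbols are ours and not the paper's.

```latex
\[
w = r_{1}u_{1} + r_{2}u_{2}, \qquad
r_{i} = \langle w, u_{i} \rangle = \operatorname{corr}(w, u_{i}), \qquad
\|w\|^{2} = r_{1}^{2} + r_{2}^{2} = 1 .
\]
\[
r_{1} = r_{2} = r \;\Longrightarrow\; 2r^{2} = 1
\;\Longrightarrow\; r = 0.5^{1/2} \approx 0.71 .
\]
```

Under these assumptions the two squared correlations must sum to one, so both components can share the load dependence equally only at r ≈ 0.71; the observed split (one correlation near 1, the other near 0) is the opposite extreme.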



 
Figure 3. The expressions of the two significant WM load-related spatial patterns are plotted. Solid diamonds: first component; hollow diamonds: second component; thinner line: least squares fit of WM load to component 1; thicker line: least squares fit of WM load to component 2. The scale of the y-axis is unitless, as the expression vectors have been normalized to unit magnitude. The mean expression across WM load is necessarily zero for both patterns, as this aspect of the data did not lie in the contrast space of interest.

 
The first component, which correlates with WM load, was expressed strongly in parts of the left inferior frontal gyrus/anterior insula (possibly Broca's area, BA 44/45), right anterior insula, bilateral caudate, bilateral precentral gyrus (BA 6), bilateral middle frontal gyrus [i.e. dorsolateral prefrontal cortex (DLPFC); BA 9/46], bilateral inferior parietal lobule (IPL; BA 39 and 40), left superior parietal lobule (BA 7), medial frontal gyrus (BA 6), anterior cingulate gyrus (BA 32) and bilateral superior frontal gyrus (BA 8). Many of these areas have been implicated in WM function in previous neuroimaging studies (Braver et al., 1997; Manoach et al., 1997; Courtney et al., 1998; Jha and McCarthy, 2000; Glahn et al., 2002; Rypma et al., 2002; Veltman et al., 2003). In contrast, the second component, whose expression was orthogonal to WM load, was expressed strongly in medial temporal lobe structures, most so in bilateral hippocampal loci. There was no substantial expression of this second component in PFC or parietal cortices. Expressions of the two patterns are shown rendered on a brain representative of the MNI space in Figure 4. Selected coronal slices of the spatial patterns are shown in Figure 5.



Figure 4. A three-dimensional brain rendering of the two significant WM load-related spatial patterns (scaled to transform the pattern weights to t-values) is shown. The underlying structural image is the representative ‘single-subject’ rendered brain image provided with SPM99. Positive weights of the first spatial pattern (i.e. the one that monotonically increases with respect to WM load) are shown in red. Positive weights of the second spatial pattern, which is nearly orthogonal to WM load, are shown in green. For display purposes, these spatial patterns have been thresholded at t = 4, and a cluster size of 100 voxels (0.8 cm³). The intensity of color on the brain surface is the integral, along a path normal to the brain surface, of t-values which have been exponentially decayed (space constant = 14 mm) based on their depth from the brain surface.

 


Figure 5. Coronal slices through the spatial patterns of the two significant WM load-related components of delay period fMRI signal are shown. The patterns were thresholded using the same conventions as in Figure 4. The y-positions (in mm in MNI standard brain space) of the coronal slices are indicated to the left of each row, and were selected to illustrate the double dissociation of prefrontal and hippocampal component expressions.

 
We wished to assess if the WM load relationships of the identified spatial patterns accurately reflect the delay period fMRI responses at individual brain locations. To do this, we plotted (separately for each hemisphere; Fig. 6) the delay period fMRI response amplitude relative to the implicit baseline of the experiment (see Materials and Methods) within individual voxels of DLPFC, IPL and hippocampus showing the highest spatial expressions of their respective dominant patterns (i.e. pattern 1 for DLPFC and IPL, and pattern 2 for hippocampus). The data at each location are necessarily composed of a weighted sum of the two patterns, and theoretically do not have to exactly match their dominant patterns. But, it is evident from comparison of Figures 3 and 6 that the relationship between WM load and delay period fMRI signal at each of the locations (in both hemispheres) is very similar to their respective, dominant patterns.



Figure 6. Delay period fMRI signals at selected local maxima of spatial pattern expression are plotted as a function of WM load. The dominant pattern for dorsolateral prefrontal cortex (DLPFC) and inferior parietal lobule (IPL) is pattern 1, and for hippocampus (Hipp) is pattern 2. The MNI space coordinates (in mm) of these locations are (a) L Hipp [–24 –22 –10], L DLPFC [–38 28 24], L IPL [–36 –50 42]; and (b) R Hipp [28 –28 –10], R DLPFC [38 42 28], R IPL [36 –44 38]. The error bars indicate standard errors computed across subjects.

 
Figure 6 is suggestive of a region (DLPFC/IPL) × hemisphere interaction. Though uncorrected P-values are not valid when selecting an analysis to perform based on the appearance of data, we subjected these data to various ANOVAs for descriptive purposes (thus the term ‘significant’ is meaningful in these results only nominally). Averaged across loads, there was a significant region × hemisphere interaction [F(1,39) = 35.52, P < 0.0001], such that the difference (DLPFC – IPL) in delay period signal between the selected DLPFC and IPL voxels was positive in the right hemisphere and negative in the left hemisphere. However, neither the WM load (reduced to a 1 df linear trend) × region × hemisphere interaction [F(1,39) = 0.91, P = 0.35] nor the WM load × region interactions in either the right [F(1,39) = 1.47, P = 0.23] or left [F(1,39) = 0.003, P = 0.96] hemispheres were significant. Finally, there was no WM load × hemisphere interaction [F(1,39) = 1.34, P = 0.25]. The significant region × hemisphere interaction (for data collapsed across WM loads) is suggestive not of a hemispheric asymmetry in the regional patterns of WM load-related processing, but rather of a hemispheric difference in the relative ways DLPFC and IPL are involved with WM load-independent processing. More theoretical work is required to generate hypotheses concerning the relationship between WM load-dependent and WM load-independent neurophysiological activity. In contrast, the null results for all interactions involving WM load suggest that there are very similar delay period WM load dependencies in DLPFC and IPL, as well as across hemispheres. Taken strongly, these null results might be inconsistent with a lateralization (as regards these two regions) of WM load-dependent function for phonological material (Vallar et al., 1991), and would also not support the idea that DLPFC is more involved in WM processing than IPL.
However, another caveat of these particular results is that they concern particular voxels that were selected as having high expression of a pattern related to WM load, and so do not represent an unbiased sample of their parent neuroanatomical regions.
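For readers less familiar with repeated-measures designs, a 1 df within-subject interaction such as region × hemisphere can be expressed as a one-sample t-test on a per-subject difference of differences, with F(1, n – 1) = t². The sketch below illustrates this under stated assumptions: the function name `interaction_f` and the five-subject data are hypothetical and are not the values or the software used in the study.

```python
import math
import statistics

def interaction_f(r_dlpfc, r_ipl, l_dlpfc, l_ipl):
    """Return (F, df, mean score) for a 2 x 2 within-subject interaction.

    Each argument holds one load-averaged delay period value per subject.
    The interaction score is (right DLPFC - right IPL) - (left DLPFC - left IPL).
    """
    scores = [(rd - ri) - (ld - li)
              for rd, ri, ld, li in zip(r_dlpfc, r_ipl, l_dlpfc, l_ipl)]
    n = len(scores)
    mean_score = statistics.fmean(scores)
    standard_error = statistics.stdev(scores) / math.sqrt(n)
    t = mean_score / standard_error
    # For a contrast with 1 numerator df, F equals t squared.
    return t * t, (1, n - 1), mean_score

# Hypothetical data for five subjects (arbitrary signal units).
right_dlpfc = [0.5, 0.6, 0.4, 0.7, 0.5]
right_ipl = [0.2, 0.3, 0.3, 0.4, 0.1]
left_dlpfc = [0.3, 0.2, 0.4, 0.3, 0.2]
left_ipl = [0.5, 0.5, 0.5, 0.6, 0.4]

f_value, df, mean_diff = interaction_f(right_dlpfc, right_ipl, left_dlpfc, left_ipl)
print(f"F{df} = {f_value:.2f}, mean difference of differences = {mean_diff:.2f}")
```

A positive mean score corresponds to the pattern described above, in which DLPFC – IPL is larger in the right hemisphere than in the left.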

An additional observation from Figure 6 is that in the hippocampal voxel of interest of the right hemisphere, the delay period fMRI responses for set sizes 1 and 6 were significantly less than that observed during baseline, while the fMRI response at set size 3 was not significantly different from baseline. The delay period signal of the analogous voxel in the left hippocampus was significantly greater than baseline at size 3, but not significantly different from baseline when averaged across WM loads. The values relative to baseline are not an artifact or biasing effect of the method used to extract these patterns, which had nothing to do per se with the offsets of the delay period fMRI responses relative to baseline. This trend of non-positive delay period values in hippocampus is broadly consistent with the absence of WM-related hippocampal activity in the results of many whole-brain imaging studies (Courtney et al., 1996, 1997; Smith et al., 1996; Braver et al., 1997; Cohen et al., 1997; Rypma et al., 1999).

Generalization Test of Ranganath and D'Esposito

Ranganath and D'Esposito (2001) reported delay period fMRI signal change (relative to an inter-trial interval baseline) in the hippocampus bilaterally (fig. 2 from that report) in the context of a delayed item recognition task for trial-unique, novel faces. The fMRI pulse sequences, voxel sizes, and spatial smoothing kernels are similar between that study and ours. We examined the delay period fMRI signal values (relative to inter-trial interval baseline) in our data at the two hippocampal coordinates (in MNI space) reported by those authors. At uncorrected significance levels, neither of the coordinates manifested a delay period fMRI signal greater than baseline at any of the WM loads (Fig. 7). However, the right hippocampal coordinate did have significantly negative values at set sizes 1 and 6.



Figure 7. Delay period fMRI signals at the MNI coordinates closest to those reported by Ranganath and D'Esposito (2001) as manifesting positive delay period activity relative to baseline are plotted as a function of WM load; the coordinates lie in left and right hippocampus, respectively. Our voxel coordinates differ from those reported by Ranganath and D'Esposito by 1 mm in the y-dimension as our voxels were (nominally, after processing) 2 mm thick, thus centering the MNI coordinates of our voxels on multiples of 2 mm. This should cause no appreciable discrepancy given the degree of spatial smoothness of both of our datasets.

 
Post hoc Hippocampal ROI Analysis

Finally, to see if the hippocampal voxels strongly expressing the second component were somehow contradictory to the response of the hippocampus as a whole, we examined the delay period fMRI responses averaged over a hippocampal ROI (Fig. 8a), as defined anatomically by one of the authors (E.Z.) on the representative single subject T1-weighted MRI supplied with SPM99 (Fig. 8b). A hemisphere by WM load, repeated measures ANOVA (sphericity test for WM load factor: Mauchly's W = 0.98, P = 0.67) detected an effect of WM load [F(2,78) = 6.41, P = 0.003]. It can be seen from Figure 8a that the relationship with WM load has the appearance of a negative U-shaped component in both hemispheres; indeed, the quadratic WM load component was significant [F(1,39) = 8.78, P = 0.005]. There was also an effect of hemisphere [F(1,39) = 5.24, P = 0.03], but no WM load by hemisphere interaction [F(2,78) = 0.61, P = 0.55]. Thus, the negative U-shaped response to WM load was present in the hippocampus as a whole, and there was no detectable difference in its expression between hemispheres.
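The linear and quadratic WM load components referred to here can be illustrated with orthogonal trend contrasts for the unequally spaced set sizes 1, 3 and 6. This is a minimal sketch under stated assumptions: the text does not specify how the trend weights were constructed, so the Gram-Schmidt construction, the function name `trend_contrasts` and the example signal values are all hypothetical.

```python
def center(x):
    """Subtract the mean from every element."""
    m = sum(x) / len(x)
    return [v - m for v in x]

def dot(a, b):
    return sum(p * q for p, q in zip(a, b))

def trend_contrasts(levels):
    """Linear and quadratic contrast weights for the given factor levels.

    The linear weights are the centered levels. The quadratic weights are
    the centered squared levels with their linear part projected out
    (Gram-Schmidt), so the two contrasts are mutually orthogonal.
    """
    linear = center(levels)
    squared = center([v * v for v in levels])
    coef = dot(squared, linear) / dot(linear, linear)
    quadratic = [s - coef * l for s, l in zip(squared, linear)]
    return linear, quadratic

linear, quadratic = trend_contrasts([1, 3, 6])
# For set sizes 1, 3 and 6 the weights are proportional to
# (-7, -1, 8) for the linear trend and (3, -5, 2) for the quadratic trend.

# Hypothetical inverted-U delay period signal across the three set sizes.
signal = [-0.10, 0.05, -0.10]
linear_score = dot(linear, signal)
quadratic_score = dot(quadratic, signal)
print(f"linear = {linear_score:.3f}, quadratic = {quadratic_score:.3f}")
```

With these weights an inverted (negative) U-shaped profile yields a negative quadratic score, while a profile that rises steadily with set size loads mainly on the linear contrast.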



Figure 8. (a) Delay period fMRI signals, spatially averaged over left and right hippocampal ROIs, respectively, are plotted as a function of WM load. (b) Anatomically defined hippocampal ROIs (left and right) that were used to generate the data in (a) are illustrated as an overlay on the representative MNI brain supplied with SPM99. The ROI is indicated by brighter gray scale values. Each slice corresponds to a different z coordinate in MNI space.

 
Of course, given that we chose to analyze the hippocampal ROI based on the observation that certain parts of the hippocampus were identified by MLM as having a relationship with WM load, these WM load ANOVA P-values for the hippocampal ROI are not strictly valid. Hence, we interpret these ROI-level findings at a descriptive level, and would only have considered a negative result as providing strong information. Thus, that there was a (nominally) significant quadratic effect is interpreted by us as simply a failure to disconfirm a general hippocampal negative U-shaped response to WM load, as opposed to providing strong, independent support for such an effect over and above the MLM results.

Averaged across loads, the delay period signal was not significantly different from 0 for either left [t(39) = –0.41, two-tailed P = 0.68], right [t(39) = –1.89, two-tailed P = 0.07] or bilateral hippocampal ROIs [t(39) = –1.23, two-tailed P = 0.22]. This confirms the finding, presented earlier for selected hippocampal voxels, of no positive delay period fMRI response in hippocampus when averaging over WM loads.


    Discussion
 
Here we reported the existence of two WM load-related spatial patterns identified through the use of MLM, a method not commonly employed in neuroimaging analysis. As a preamble to providing further interpretation of this finding, we will first describe how MLM results can be considered in the context of the far more typical statistical approach of SPM. Some of these points were mentioned previously, but bear repetition. Also, we want to make clear two superficial modifications in our implementation of MLM relative to the source paper (Worsley et al., 1997).

MLM Interpretation

Like SPM, MLM involves voxel-wise estimation of GLM parameters. Also in common with typical SPM analyses (Friston et al., 1995; Worsley et al., 1996), the correlations between the errors in different voxels are assumed to be caused by a Gaussian point spread function. In both SPM and MLM, one defines a contrast space whose dimensionality equals the number of effects of interest (in the trivial case, the number of effects of interest = 1). In typical (i.e. voxel-wise testing; Friston et al., 1996) SPM, the standard null hypothesis (i.e. that the effects of interest all equal zero) is then statistically assessed at each voxel by applying an appropriate threshold to the SPM{F} (or equivalently, to the SPM{t} if the number of effects of interest = 1). In MLM, however, voxel-wise significance tests are not per se performed. Instead, one statistically tests the number of latent spatial patterns in the effects of interest. The minimum number of components is 0; the maximum is the number of effects of interest. A finding of one or more components formally rejects the spatially omnibus null hypothesis. Thus, both SPM and MLM are based on exactly the same GLM effects and share the same global null hypothesis. But while SPM affords statistical inference about the effects of interest at each voxel, MLM tests for effects of interest at a spatially omnibus level. Thus, SPM affords formal statistical spatial localization; MLM does not.

Assuming a positive result with MLM, the number of significant components can range from 1 to the number of effects of interest. This number informs as to the spatial structure of the effects of interest. For example, in the current study, there were two effects of interest: the difference in delay period fMRI signal amplitude between (i) 1 letter and 3 letters, and (ii) 3 letters and 6 letters. MLM detected two components. This means that the (true) spatial patterns associated with (i) and (ii) are not identical.
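The two effects of interest can be written as rows of a contrast matrix applied to the per-condition delay period amplitudes at a voxel. The sketch below is illustrative only: the sign convention (larger set size minus smaller), the helper name `apply_contrasts` and the amplitude values are assumptions, not details taken from the analysis.

```python
# Rows: (i) 3 letters minus 1 letter, (ii) 6 letters minus 3 letters.
# Columns: delay period amplitude for set sizes 1, 3 and 6.
CONTRASTS = [
    [-1, 1, 0],
    [0, -1, 1],
]

def apply_contrasts(contrasts, amplitudes):
    """Return one effect of interest per contrast row."""
    return [sum(c * a for c, a in zip(row, amplitudes)) for row in contrasts]

# Hypothetical amplitudes at one voxel for set sizes 1, 3 and 6.
amplitudes = [0.2, 0.5, 0.9]
effects = apply_contrasts(CONTRASTS, amplitudes)

# A constant offset shared by all set sizes lies outside this contrast
# space, so it contributes nothing to either effect of interest.
offset_effects = apply_contrasts(CONTRASTS, [1.0, 1.0, 1.0])
print(effects, offset_effects)
```

Because each row sums to zero, the mean level across set sizes is invisible to these effects, which is why the mean expression of both patterns across WM load is necessarily zero.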

The implementation of MLM in the current paper is superficially different from that in the source paper (Worsley et al., 1997). First, the source paper applied MLM to a first-level (i.e. fMRI time-series) GLM, not across-subject data in a second-level model (Holmes and Friston, 1998). But like any statistical model, MLM is valid as long as its assumptions are satisfied. To wit, the applied Gaussian spatial smoothing is typically assumed to dominate the error spatial covariance structure in both single-subject and group SPM analyses (Holmes and Friston, 1998). The within-voxel (i.e. repeated measures) covariance matrix here was estimated from the entire dataset (Materials and Methods); Monte Carlo simulations supported that MLM theory performs quite well at controlling the sequential latent root test false positive rate when this matrix is estimated and ν is large (data not shown). There is evidence that the distribution (conditioned on subject) of voxel-wise, time-series errors in fMRI data is nearly Gaussian (Aguirre et al., 1998), and we feel it is reasonable to tentatively presume that this extends to across-subject effects as well. Also, our use of a relatively large number of subjects should make parametric statistical inference robust to modest violations of normality (Kirk, 1982). We therefore posit that the assumptions of the MLM were sufficiently satisfied in our implementation. Finally, the Worsley et al. (1997) paper did not explicitly present how to implement MLM using arbitrary contrasts; instead the contrasts comprising the effects of interest were single parameters of the GLM. The theoretical generalization of MLM to using arbitrary contrasts, instead of being restricted to individual GLM parameters, is almost immediate from the framework presented by Worsley and colleagues.

We feel that the MLM methodology of Worsley et al. is a powerful and natural extension of the GLM framework for structural and functional neuroimaging data. We feel its main theoretical strength is the explicit testing of hypotheses concerning similarities/differences of spatial patterns of effects. MLM also enjoys a high sensitivity to reasonable effect sizes, and, according to the simulations of Worsley et al. and our group (data not shown), it controls specificity very close to desired levels.

Dominant WM Load Effects are Monotonic

The delay period, WM load-related component explaining the most variance in brain activity was monotonic (and more precisely, very well described as linear) with WM load. One might presuppose that this monotonicity is related directly to the nearly linear relationship of reaction time to set size. This widely observed relationship is consistent with either a serial, exhaustive comparison of the probe letter with each letter in the original set (Sternberg, 1966) or a limited capacity, parallel search (Townsend, 1990). Regardless, a linear dependence of the duration of this memory search process on set size could not be used to causally explain the approximately linear dependence of delay period fMRI activity on WM load in the first SVD component. This is because the memory search must occur after the delay period, once the probe letter has been presented. Though it is beyond the scope of this report, neural correlates of memory search can be investigated by assessing WM load dependence during the probe presentation period.

The monotonic relationship of delay period fMRI signal with WM load in PFC, premotor and parietal cortices suggests that the intensity of synaptic processing in these areas (Logothetis et al., 2001) is proportional to WM load, informing as to neural mechanisms of WM maintenance. Parietal cortex and PFC are reciprocally anatomically connected (Divac et al., 1977; Petrides and Pandya, 1999), and in the non-human primate seem to be involved in a mutually dependent neural circuit during the delay period of delayed response tasks (Chafee and Goldman-Rakic, 2000; Quintana et al., 1989). Thus, from a neurophysiological perspective, it is not a priori unreasonable that delay period activities in both regions have a similar relationship to WM load.

Electrophysiological studies in humans suggest that phase locking of neural oscillations to the components of the task plays a role in cortical WM processing (Raghavachari et al., 2001; Rizzuto et al., 2003). The mechanistic relationship between delay period fMRI activity (as measured in studies like the current one) and sustained neural spiking measured at the single cell level in non-human primates during WM tasks seems relatively clear, or at least highly plausible (Logothetis et al., 2001). In contrast, the relationship between fMRI signal presumably related to WM maintenance and the oscillatory phenomena cited above needs elucidation. This will require data from electrophysiological studies in humans examining the effect of WM load in a WM maintenance context, as opposed to WM tasks that involve the putative processes of manipulation and monitoring (McEvoy et al., 1998).

Maintenance of phonological information in WM is thought by some (Warrington and Shallice, 1969; Vallar and Baddeley, 1984; Baddeley et al., 2002) to involve an articulatory loop comprising subvocal rehearsal and phonological store subsystems (but see Nairne, 2002). Broca's area is thought to be involved in subvocal rehearsal, and the left supramarginal gyrus is thought to be involved in phonological storage (Paulesu et al., 1993). Thus, one might hypothesize that some of the WM load sensitive areas (the region in the vicinity of Broca's area and the left inferior parietal lobule/supramarginal gyrus in particular) are directly involved in the articulatory loop. While it is not immediately clear to us if the current theoretical account of the articulatory loop would predict greater neural activity per unit time with greater WM phonological storage loads (Cowan et al., 2003), the current results would be consistent with that idea. Another possibility is that, as WM load sensitivity in DLPFC and other areas has also been observed for non-verbal materials (Glahn et al., 2002), perhaps some of the brain areas manifesting the first component engage in material-independent attentional processes (Postle and D'Esposito, 1999) that nevertheless scale with WM load. Also, one should be careful in generalizing what is seen for a given delay duration (7 s in the current task) to longer delays, as Jha and McCarthy (2000) reported activity in PFC that was sustained throughout very long delay intervals (up to 24 s), but sensitivity to the amount of information in WM was only found in the early part of the delay. This suggests that perhaps pure WM maintenance, which might occur after a period of several seconds of consolidation, is not load dependent (Jha and McCarthy, 2000).

It is fairly well established that in non-human primates, DLPFC is necessary for even the simplest WM maintenance tasks (Goldman and Rosvold, 1970; Bauer and Fuster, 1976; Passingham, 1985; Funahashi et al., 1993; Quintana and Fuster, 1993). However, the precise role of PFC in WM processing in humans continues to be an area of great interest and disagreement. A meta-analysis of lesion data in humans led D'Esposito and Postle (1999) to conclude that while DLPFC might be necessary for delayed response tasks (which in their parlance included delayed-match-to-sample and delayed-nonmatch-to-sample tasks), it is not necessary for simple span tasks (forward digit span and block/Corsi span). These authors made an a priori distinction between the theoretical WM processes tapped by span tasks and delayed response tasks, with span tasks presumably relying more on storage and delayed response tasks relying more on rehearsal/maintenance. Therefore, they concluded that DLPFC in humans is necessary for rehearsal/maintenance, but not for storage, which they deduced was mediated by more posterior cortical regions. The delay period of a delayed item recognition task would require both storage and rehearsal according to standard models of WM maintenance, in which the stored trace is intermittently refreshed by rehearsal (Nairne, 2002).

Based on their interpretation of their meta-analytic findings, D'Esposito and Postle (1999) hypothesized that left ventrolateral PFC, but not left or right DLPFC, is necessary for verbal delayed response performance. To the extent that a delayed item recognition task for letters corresponds to a verbal delayed response task, either the linear DLPFC response to WM load we observed does not reflect necessity for task performance (while perhaps the homologous relationship in the left ventrolateral PFC locus does), or this result is at odds with their hypothesis. Moreover, ignoring their particular prediction concerning localization within PFC of verbal delayed response task mediation, if one assumes that the degree of rehearsal (i.e. the subvocal articulation rate) scales with WM load, then our results in PFC are consistent with D'Esposito and Postle's hypothesis concerning the role of PFC in WM (i.e. that it mediates rehearsal). However, if only storage, and not rehearsal, scales with WM load, then our results would be inconsistent with that general hypothesis. We could not find any data in the literature that speak to the dependence of rehearsal rate on WM load.

Based on neuroimaging evidence, some have argued that processing in human DLPFC is related less to WM maintenance and more (D'Esposito et al., 1999; Postle et al., 1999; Rypma et al., 1999; Glahn et al., 2002) or exclusively (Owen et al., 1996, 1999) to manipulation/organization of information within WM. Still, DLPFC WM load sensitivity has been reported for pure maintenance tasks, presumably relying on phonological storage and rehearsal, such as the one used in the current study (Manoach et al., 1997; D'Esposito et al., 1999; Rypma et al., 1999, 2002; Veltman et al., 2003), and recent data suggest that very similar if not identical PFC regions are involved in both WM maintenance and manipulation (Veltman et al., 2003).

Similar to our current study, Rypma et al. (2002) examined fMRI signal associated with different phases of a delayed item recognition task in which set size was varied from 1 to 8 letters, and reported a positive relationship between set size and delay period activation in DLPFC. They did not observe such an effect in their earlier studies (Rypma and D'Esposito, 1999, 2000), in which they instead noted a relationship of set size to stimulus encoding period activation in DLPFC (a result which itself was not replicated in their 2002 study). They attributed these across-study differences post hoc potentially to their using set sizes of only 2 and 6 in their earlier studies (Rypma and D'Esposito, 1999, 2000), but no particular argument was put forward to mechanistically explain why or how this would have changed the results in the manner observed. They also suggested that subjects were using a different strategy to perform the current task in their 2002 study compared with their earlier studies. But it is not clear whether they meant that the subjects from the two studies were drawn from different populations, or if their task was somehow sufficiently different (now including a greater array of set sizes and an expansion of the set size range to include 7 and 8 letters) to engender a strategy different from that used in their earlier studies. Nevertheless, the results of their 2002 study were used to draw the same conclusion as the one drawn from their earlier studies, namely that DLPFC plays a role in strategic memory organization. Another possible interpretation of their 2002 result is that DLPFC processing related to set-size variation simply reflects scaling of processes related to WM maintenance, such as rehearsal or perhaps even storage [even though the latter would go against the hypothesis of D'Esposito and Postle (1999)].
But again, this is not to say that processing in DLPFC is in every task context related only to WM maintenance (D'Esposito et al., 1999; Owen et al., 1999; Postle et al., 1999).

At the moment, it is probably most accurate to say that there is no consensus as to the precise role of PFC in WM in humans, and that two broad, competing hypotheses prevail: (i) a WM maintenance role for ventrolateral PFC and a monitoring/manipulation role for DLPFC (D'Esposito et al., 1999; Owen et al., 1999; Postle et al., 1999; Rypma et al., 2002); and (ii) a critical role of DLPFC in both maintenance and manipulation (Manoach et al., 1997; Jha and McCarthy, 2000; Zarahn et al., 2000; Veltman et al., 2003). Possible explanations for the failure to have reached a consensus are the theoretical differences in information provided by lesion and neuroimaging studies, the perhaps underdetermined nature of the constructs of ‘maintenance’ and ‘manipulation’, and the seeming high inter-study variability in both the lesion and neuroimaging literatures.

WM Load Effects in Hippocampus

A second WM load-related component was detected whose strongest expression was in hippocampus. The hippocampus and other medial temporal lobe (MTL) structures seem to be essential for the encoding of new information into LTM. In particular, both the hippocampus (Alvarez et al., 1995) and the perirhinal/parahippocampal cortex (Zola-Morgan et al., 1989) appear to be necessary for the formation of new LTM traces. In contrast, there is evidence that in humans and non-human primates, these MTL structures are not always necessary for remembering information on the time scale of seconds, i.e. are not necessary for WM maintenance (Sidman et al., 1968; Wickelgren, 1968; Zola-Morgan and Squire, 1985; Cave and Squire, 1992; Alvarez et al., 1994; Leonard et al., 1995; Mayes et al., 2002). However, there is controversy over this point as other data (Holdstock et al., 1995; Owen et al., 1995; Buffalo et al., 1998; Squire et al., 1988; Baxter and Murray, 2001) or analysis approaches (Ringo, 1991) are suggestive of a necessary role of MTL in WM maintenance. In terms of neural engagement of hippocampus during WM maintenance, most tests of delay period fMRI activity have not yielded positive results in hippocampus or other MTL structures (see e.g. Cabeza et al., 2002, table 1). However, an influential study by Ranganath and D'Esposito (2001) demonstrated sustained neurophysiological activity in anterior hippocampus during the delay period of a WM task for novel faces. The amplitude of this delay period activity was modulated by the novelty of the face stimuli, leading the authors to hypothesize that the hippocampus is involved in WM processing for novel stimuli.

In the current study, the critical property of the WM load-related component expressed strongly in MTL is that it is orthogonal to WM load. That is, delay period fMRI signal in these regions did not remain constant across WM loads. Instead, delay period MTL signal manifested a pronounced inverted U-shape with respect to WM load. While a region displaying a constant level of neurophysiologic activity across WM loads could still ostensibly be consistent with a critical role in WM processing (i.e. a role that is simply independent of WM load), a pattern of delay period signal that is dependent on, yet orthogonal to, WM load would make a role in WM maintenance less plausible. However, one alternative explanation is that, counter to our logic, WM load was itself orthogonal to set size (instead of being linearly related to it). That is, the amount of information maintained in WM would, for some reason, increase from set size 1 to set size 3, but then decrease from set size 3 to set size 6. For this to be true, our basic understanding of the amount of information being held in WM must be fundamentally in error. For example, one could conceive, albeit with difficulty, of an information chunking process such that a set size of 6 and a set size of 1 lead to the same WM load, while set size 3 is not chunked as efficiently as either set sizes 1 or 6; but this would be inimical to the extant data relating set size and WM capacity (Cowan, 2001). Nevertheless, if WM load was indeed orthogonal to set size, our data would be consistent with a role of hippocampus in WM. Such a premise would also complicate the interpretation of the first component, whose expression was linear with set size.

Another alternative explanation is that a qualitatively different cognitive process mediates delayed item recognition performance for low (say, up to set size 3) and higher set sizes, and perhaps hippocampus is involved in only the low load mechanism. This would imply the invalidity of a unitary phonological WM maintenance process, and seemingly require a paradigm shift. We are not aware of psychological data that would support such a qualitative difference in delayed item recognition processing at set size 3 and set size 6 when rehearsal is allowed (Sternberg, 1966; Cowan, 2001).

If one entertains the premise that the hippocampus is not involved in WM maintenance, one should speculate as to why there would be any set-size dependence in an area not related to WM per se. One hypothesis in this regard is that WM-irrelevant encoding of information into LTM occurred in this task context as a joint function of available attentional resources (i.e. what remains of attentional resources after the appropriate allocation of said resources to the primary task of rehearsal) and the amount of information available to encode. Again, the critical idea here is that this encoding would be irrelevant to delayed item recognition performance. Perhaps this function peaked in this delayed item recognition task at a set size close to 3 (with set size 1 having too little information to encode and set size 6 having a deficiency in available attentional resources after allocation of the necessary attention to the primary task of rehearsal). This type of model is testable by having subjects perform a delayed item recognition task with trial unique stimuli. Subjects can then be subsequently tested for recognition memory for these stimuli. This hypothesis would predict that the d' for recognition would be lower for stimuli presented in the highest set size condition (even though delayed item recognition accuracy would be expected to remain constant across these set sizes). Letter stimuli are not suited for this type of experiment due to their small number (which would unreasonably limit the number of trials that could be administered). A conceivable alternative stimulus class might be pictures of common objects. The concomitant neuroimaging finding would have to be tested using the new stimuli, as the relationship might be expect