Cerebral Cortex, Vol. 9, No. 1, 4-19,
January 1999
© 1999 Oxford University Press
On the Neural Correlates of Visual Perception
Department of Neurology, University of Massachusetts Medical Center, Worcester, MA 01655, USA
| Abstract |
|---|
Neurological findings suggest that the human striate cortex (V1) is an indispensable component of a neural substratum subserving static achromatic form perception in its own right and not simply as a central distributor of retinally derived information to extrastriate visual areas. This view is further supported by physiological evidence in primates that the finest-grained conjoined representation of spatial detail and retinotopic localization that underlies phenomenal visual experience for local brightness discriminations is selectively represented at cortical levels by the activity of certain neurons in V1. However, at first glance, support for these ideas would appear to be undermined by incontrovertible neurological evidence (visual hemineglect and the simultanagnosias) and recent psychophysical results on `crowding' that confirm that activation of neurons in V1 may, at times, be insufficient to generate a percept. Moreover, a recent proposal suggests that neural correlates of visual awareness must project directly to those in executive space, thus automatically excluding V1 from a related perceptual space because V1 lacks such direct projections. Both sets of concerns are, however, resolved within the context of adaptive resonance theories. Recursive loops, linking the dorsal lateral geniculate nucleus (LGN) through successive cortical visual areas to the temporal lobe by means of a series of ascending and descending pathways, provide a neuronal substratum at each level within a modular framework for mutually consistent descriptions of sensory data. At steady state, such networks obviate the necessity that neural correlates of visual experience project directly to those in executive space because a neural phenomenal perceptual space subserving form vision is continuously updated by information from an object recognition space equivalent to that destined to reach executive space. 
Within this framework, activity in V1 may engender percepts that accompany figure-ground segregations only when dynamic incongruities are resolved both within and between ascending and descending streams. Synchronous neuronal activity on a short timescale within and across cortical areas, proposed and sometimes observed as perceptual correlates, may also serve as a marker that a steady state has been achieved, which, in turn, may be a requirement for the longer time constants that accompany the emergence and stability of perceptual states compared to the faster dynamics of adapting networks and the still faster dynamics of individual action potentials. Finally, the same consensus of neuronal activity across ascending and descending pathways linking multiple cortical areas that in anatomic sequence subserve phenomenal visual experiences and object recognition may underlie the normal unity of conscious experience.
| Introduction |
|---|
It may now be possible to discern the beginnings of a unified framework to delimit the neural correlates of at least one aspect of conscious vision. Tentatively, it appears helpful to subdivide the entirety of conscious vision into at least four components. Perhaps the most basic is that of phenomenal visual experience or `phenomenal consciousness' as defined by Block (1995). Phenomenal qualities such as the raw sensations of brightness and color are sometimes referred to as `qualia'. The neural correlates of such phenomenal visual experience may be considered to comprise a phenomenal perceptual space (PPS). Within this space, object localization is retinotopic and thus relative to the direct line of sight (Holmes, 1945
Between perceptual space and executive space, it may be convenient to assume the existence of an object recognition space within which we can surmise the existence of neural representations that can uniquely specify an object and serve as concepts and subsequently as working memories for further analysis within executive space. Later on, we may enquire as to whether such representations of concepts within object recognition space are purely symbolic or should be included within a phenomenal perceptual space. Finally, extra-personal spaces (Grüsser and Landis, 1991
), i.e. mappings of objects in or locations of the external world in head-centered (Andersen et al., 1985
) or body-centered spatiotopic coordinate systems have long been recognized. Moreover, `allocentric' coordinate systems provide mappings of the world independent of our actual percepts and spatial position (Grüsser and Landis, 1991
). These representations, largely initially mediated by specialized regions within the parietal lobe (Critchley, 1953
), subserve absolute location of objects in space (Holmes, 1945
; Galletti et al., 1995
), mental imagery of spatial mappings (Grüsser and Landis, 1991
), abstract representations of space that can be used to guide movements (Andersen et al., 1985
, 1997
; Milner and Goodale, 1995
), spatially referent binding of color and motion (Friedman-Hill et al., 1995
) and selective attention (Milner and Goodale, 1995
). Some of these representations may modify but do not give rise independently to phenomenal visual experience.
I shall largely confine the present analysis to phenomenal visual experience; even here further restrictions to simplify the problem are helpful. Thus, I shall focus largely on static achromatic visual experience, leaving aside the equally important subjects of the experience of color and motion. I recognize that attempting to formulate a tentative but coherent framework for consideration of the neural correlates of even one type of phenomenal visual experience is hazardous. Nevertheless, I believe that we have reached a point based on the cumulative experience of clinical neurology, the basic neurosciences and advances in neural network theory over the past decade that a unified framework for consideration of such correlates is now possible and, at the very least, opens the way for improved models based on refutation or confirmation of some of the present set of proposals.
| The Neurology of Human Phenomenal Visual Experience |
|---|
Not long after Henschen (1893) established that the human primary visual receptive area corresponded to what we now call the striate cortex or V1, neurologists began to discover disorders of higher visual function from brain injuries beyond the striate cortex that nevertheless left basic visual experience intact. Before the middle of the 20th century, Holmes (1945) concluded that human primary visual perception, including discriminations based on brightness and color, was subserved by the striate cortex. Subsequent work proved that Holmes was mistaken in assigning color perception to the striate cortex. We know now that certain brain lesions beyond V1 and V2 can eliminate perception of color (the achromatopsias) (Damasio et al., 1980
Holmes and his contemporaries knew that humans experiencing alexia (Holmes, 1945
; Grüsser and Landis, 1991
) as a consequence of certain occipital lobe lesions may lose the ability to read and even identify letters even though they see well enough to copy them accurately. Similarly, patients experiencing associative visual agnosias (Rubens and Benson, 1971
; Damasio et al., 1982
; Grüsser and Landis, 1991
) as a consequence of lesions often at the occipitotemporal junction may lose the ability to identify complex objects even when they can copy them accurately and their language ability has remained intact. Moreover, there remains longstanding evidence in both primates (Gross, 1976
) and humans (Damasio, 1990
) that bilateral ablation or injury to inferotemporal cortex impairs certain visual discriminations and some higher-level recognitions but leaves phenomenal visual experience essentially intact. Similarly, as noted above, phenomenal visual experience survives extensive or diverse injuries to prefrontal cortical areas, though no single reported case has had a complete bilateral ablation of all of prefrontal cortex.
Although neurons in MT/V5 may process certain classes of motion (Barbur et al., 1993
), integrate depth and motion cues (Bradley et al., 1995
) and derive three-dimensional structure from motion (Bradley et al., 1998
), other cells in dorsal MST are involved in the analysis of optic flow (for review, see Andersen et al., 1996
). I leave these motion-induced experiences largely aside in order to focus on static achromatic form vision which has long been considered a function of the occipitotemporal or `ventral stream' (Mishkin et al., 1983
). Similarly, I will discuss only briefly the dorsal pathways within the parietal lobe that establish multiple distinct spatial reference frames and provide spatial representations for the guidance of selective actions with respect to the localization of visual targets (for review, see Colby, 1998
). Within the ventral stream, V1 projects principally to V2 and thence predominantly to V4 with subsequent projections from V4 directly and through TEO to the multiple inferotemporal cortical areas (IT) within the lateral temporal lobe (Felleman and Van Essen, 1991
) (see Fig. 1
).
Restricted lesions within V4 in primates (Schiller, 1993
Visual field defects for static achromatic stimuli in humans invariably occur after injury to the corresponding retinotopic representation within contralateral V1 (Holmes, 1945
). Field defects that strictly respect the horizontal meridian may occur after lesions of V2 that extend across the V2/V3 border (Horton and Hoyt, 1991
; McFadzean and Hadley, 1997
) which marks that meridian as well as after lesions within V1 (McFadzean and Hadley, 1997
) (see Fig. 2
). However, isolated lesions within V3 alone, sparing V2, or in any cortical area beyond V3 have not been reported to produce such field defects.
Based on the foregoing neurological results, V2 may seem to have an equal claim to that of V1 for subserving luminance-based visual experience. Indeed, although macaques with ablations of V1 are unable to detect sine-wave gratings above 12 c/d, they can detect the lower spatial frequencies, albeit only at very high contrasts (Miller et al., 1980
It might be argued that Merigan et al. (1993) did not exclude `blindsight', i.e. `visual capacity in the absence of acknowledged awareness' (Weiskrantz, 1995
), as an explanation for their results because they did not also employ the paradigm of Cowey and Stoerig (1995), who trained monkeys to discriminate between real world events and no-stimulus blanks, thus permitting these authors to determine whether the monkeys perceived stimuli or treated them as blanks. However, Merigan et al. (1993) placed lesions in V1 for comparison and these devastated vision. Moreover, the conditions for eliciting `blindsight' and the attendant stimulus parameters may be rather stringent (Weiskrantz, 1995
). Thus, it is much more likely, although not entirely certain, that the macaques with lesions in V2 as studied by Merigan et al. (1993) actually perceived the test stimuli. However, tasks involving complex spatial discriminations were impaired after lesions of V2.
Lesions within either V2 or V4 do, however, interfere with the ability of an animal to distinguish test stimuli, especially weak stimuli (Schiller, 1993
), embedded in a dense array of `competing' stimuli. Analogous results have been found in humans with lesions in extrastriate areas (Rizzo and Robin, 1990
). Thus, even though extrastriate cortices beyond V2 remain essential for the discrimination of complex patterns, the most elemental phenomenal experience and isolated visual discriminations based on brightness discriminations for stimuli of medium and fine-grained detail in higher primates as well as humans seem to depend on the structural integrity of the striate cortex as long as this area is not isolated from the rest of the cerebral cortex (Bodis-Wollner et al., 1977
).
The lateral geniculate nucleus (LGN), however, does not appear to be essential for phenomenal visual experience. For example, elemental visual experiences of punctate white or colored lights called `phosphenes' can be evoked in man by direct electrical stimulation of densely hemianopic striate cortex after severance of its connections to and from the LGN (Brindley and Lewin, 1968
; Dobelle and Mladejovsky, 1974
). Even so, these results do not necessarily exclude the LGN as a substratum for visual experience as opposed to simply a conveyor of information from retina to visual cortex under normal conditions. However, if we accept as I do the premise of Crick and Koch (1995a) that the brain must construct an explicit representation of any particular visual feature as a necessary condition before that feature can be perceived, then it appears less likely that the LGN is directly involved in visual experience. For example, the `narrow-band' representations for orientation and spatial frequency in the luminance domain that match the `channels' revealed psychophysically by adaptation studies (Blakemore and Campbell, 1968) are not found in the macaque LGN (Derrington and Lennie, 1984
) but are achieved at the level of the striate cortex. Moreover, explicit representations for color are computed well beyond the LGN and even beyond V1 (Damasio et al., 1980
; Zeki, 1990
) in an area in the fusiform gyrus variously identified as V4 (McKeefry and Zeki, 1997
) or V8 (Hadjikhani et al., 1998
). See also Zeki et al. (1998) and Tootell and Hadjikhani (1998). Thus, there is no obvious evidence for the construction of explicit representations for either form or color vision within the LGN.
However, evidence that V1 is indispensable for at least certain types of visual experience, apart from its role as a central distributor of retinally derived information to extrastriate cortices, derives from results that static achromatic visual experience and luminance-based form discrimination have remained intact despite a multiplicity of diverse lesions to extrastriate, parietal, temporal and frontal cortical areas, and the fact that the LGN is not essential for humans to experience phosphenes. This interpretation of the neurological and behavioral literature is essentially in agreement with the views of Damasio (1989) and Stoerig and Cowey (1995, 1997). Admittedly, such lesions beyond V1 and V2 are often incomplete and/or unilateral, but a finding of a visual field cut as a consequence of any cortical lesion more anterior than that studied by Horton and Hoyt (1991) would have been such an extraordinary finding that it could scarcely have escaped notice in the neurological literature. Moreover, for Stoerig and Cowey, the significance of `blindsight' in humans and monkeys after unilateral striate cortex injury or ablation is not simply that such subjects possess some evidence of residual visual processing in the form of pointing to a stimulus above chance level, but that phenomenal vision is absent. Thus, for these authors, the existence of blindsight provides further evidence that the striate cortex is indispensable for at least certain aspects of phenomenal vision. Physiological studies provide further support for these conclusions. However, the striate cortex does not appear to be indispensable for the phenomenal experience of certain types of motion (Blythe et al., 1987
; Ceccaldi et al., 1992
; Barbur et al., 1993
; Zeki and ffytche, 1998
).
The `Grain' Problem
We simultaneously perceive the finest detail and retinotopic localization of visual signals in the frontoparallel plane with great precision. That such experiences are conjoined is not trivial because vast territories of the cortical mantle are involved in the disparate tasks of identifying individual objects independently of size, position and location (Lashley, 1942
), whereas other regions are dedicated to localizing objects in various coordinate frames independently of object identity (Grüsser and Landis, 1991
).
The conjoined optimal localization of signals in both the two-dimensional spatial and spatial frequency domains (Daugman, 1985
) is best expressed by sets of phase-specific simple cells in V1 (Pollen and Ronner, 1981; Foster et al., 1983
). The subzones of the receptive fields of these cells are selectively sensitive to either increments or decrements of light (Hubel and Wiesel, 1962
) and spatial processing across such receptive fields is largely linear (Jacobson et al., 1993
). The two-dimensional joint optimization for preferred orientation and spatial frequency in the frequency domain and for the x and y coordinates in the spatial domain follows from results that the largely linear receptive field line-weighting functions of these cells are well described as Gaussian-attenuated sinusoids and cosinusoids (Marcelja, 1980
). The Gaussian weighting renders the signal as the most compact to specify jointly spatial frequency and space (Gabor, 1946
). The Fourier transform of these `Gabor functions' in the space domain yields an equally compact function in the spatial frequency domain (Gabor, 1946
; Marcelja, 1980
). The product of the uncertainties within the two domains approaches a theoretical minimum (Marcelja, 1980
). Simple cells with corresponding properties, at least for analyses of brightness distributions within frontoparallel planes, are found within both V1 and V2 (Foster et al., 1985
), but not within V3A (Gaska et al., 1988
) nor apparently in V4 (Desimone and Schein, 1987
).
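The compactness property attributed above to Gabor functions can be checked numerically. The following is a minimal one-dimensional sketch, not a model of any recorded receptive field: it assumes a complex Gabor function (a Gaussian envelope multiplying a complex sinusoid), and the envelope width `sigma` (in degrees) and carrier frequency `f0` (in c/d) are hypothetical values chosen only for illustration. Under these assumptions the product of the root-mean-square widths of the energy distributions in space and in spatial frequency should come out close to Gabor's lower bound of 1/(4π).

```python
import cmath
import math


def gabor(x, sigma, f0):
    """Complex 1-D Gabor function: Gaussian envelope times a complex sinusoid."""
    envelope = math.exp(-x * x / (2.0 * sigma * sigma))
    return envelope * cmath.exp(2j * math.pi * f0 * x)


def rms_width(coords, weights):
    """Root-mean-square spread of coords under non-negative weights."""
    total = sum(weights)
    mean = sum(c * w for c, w in zip(coords, weights)) / total
    var = sum((c - mean) ** 2 * w for c, w in zip(coords, weights)) / total
    return math.sqrt(var)


def uncertainty_product(sigma=0.1, f0=4.0, n_x=401, n_f=201):
    """Spread in space times spread in spatial frequency for one Gabor function.

    sigma and f0 are hypothetical illustration values, not measured ones.
    """
    # Sample the space domain over +/- 6 envelope standard deviations.
    xs = [-6.0 * sigma + 12.0 * sigma * i / (n_x - 1) for i in range(n_x)]
    dx = xs[1] - xs[0]
    g = [gabor(x, sigma, f0) for x in xs]
    energy_x = [abs(v) ** 2 for v in g]

    # Sample the frequency domain around the carrier frequency.
    half_band = 6.0 / (2.0 * math.pi * sigma)
    fs = [f0 - half_band + 2.0 * half_band * k / (n_f - 1) for k in range(n_f)]
    energy_f = []
    for f in fs:
        # Direct Riemann-sum approximation of the Fourier integral at f.
        G = sum(v * cmath.exp(-2j * math.pi * f * x) for v, x in zip(g, xs)) * dx
        energy_f.append(abs(G) ** 2)

    return rms_width(xs, energy_x) * rms_width(fs, energy_f)


if __name__ == "__main__":
    print("numerical product:", uncertainty_product())
    print("Gabor lower bound:", 1.0 / (4.0 * math.pi))
```

Changing `sigma` trades spatial spread against frequency spread while leaving the product essentially unchanged, which is the sense in which the Gaussian envelope is jointly compact in both domains.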
Thus, neuronal ensembles that subserve both fine spatial detail and spatial position together within the same cortical areas appear to be localized to V1 and V2. Both cortices also abound in complex cells, i.e. neurons that are insensitive to local sign for luminance (Hubel and Wiesel, 1962
), responding to either increments or decrements of light. The response properties of such neurons, which follow second-order statistics in V1 (Gaska et al., 1994
), seem unsuited to convey precise information about the direction of brightness changes, and seem not to permit these neurons to interact linearly with other neurons across a cortical area to compute precise retinotopically specified brightness discriminations. Thus, the simple cells of V1 and V2 are more apt to subserve phenomenal vision for luminance discriminations than neurons that are non-selective to local sign.
There are also other neurons in V1, the non-orientation-selective `blob' cells (Livingstone and Hubel, 1982
), that are selective to local sign. However, these lack the orientation and spatial frequency selectivity needed to account for the finding that humans can distinguish square-wave from sine-wave gratings only when the contrast of the third harmonic of the square-wave grating has reached its own independent threshold (Campbell and Robson, 1968
).
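The role of the third harmonic follows from the Fourier series of a square wave: for a grating of contrast c, the fundamental has amplitude 4c/π and the third harmonic has one third of that amplitude, with no even harmonics. The sketch below simply verifies these amplitudes by numerical integration over one period. The function names and the contrast value are hypothetical and serve only as illustration; no psychophysical thresholds are modeled.

```python
import math


def square_wave(x, contrast, period=1.0):
    """Square-wave modulation about the mean level: +contrast or -contrast."""
    phase = (x / period) % 1.0
    return contrast if phase < 0.5 else -contrast


def harmonic_amplitude(n, contrast, samples=20000):
    """Estimate the amplitude of the n-th sine harmonic over one unit period."""
    dx = 1.0 / samples
    total = 0.0
    for i in range(samples):
        x = (i + 0.5) * dx  # midpoint rule
        total += square_wave(x, contrast) * math.sin(2.0 * math.pi * n * x)
    return 2.0 * total * dx


c = 0.3  # hypothetical grating contrast
fundamental = harmonic_amplitude(1, c)
third = harmonic_amplitude(3, c)
print("fundamental:", fundamental, "closed form:", 4.0 * c / math.pi)
print("third harmonic / fundamental:", third / fundamental)
```

Because the third harmonic carries only one third of the fundamental's amplitude, a square-wave grating near detection threshold is indistinguishable from a sine-wave grating of matched fundamental until its contrast is raised enough for the third harmonic to be detected in its own right.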
Thus, based solely on the above discussion, the simple cells of V1 and V2 might lay equal claim for a privileged role in phenomenal vision based upon luminance discriminations. There are, however, certain differences in the properties of simple cells in the two cortical areas. Spatial frequencies of neurons in V1 are higher than those in V2 at the same retinal eccentricity by at least an octave (Foster et al., 1985
; Levitt et al., 1994
). Thus, the finest conjoined representation for both spatial detail and retinotopic localization appears to be subserved by sets of simple cells within but not beyond V1. Moreover, the spatiotemporal pattern of the most efficient human contrast detector corresponding to `what the eye sees best' well approximates the receptive field profiles of simple cells in V1 (Watson et al., 1983
). This result strengthens the case for identifying the ensemble of phase-specific simple cells within V1 as part of an explicit representation for the detection and perception of localized achromatic stimuli.
Conversely, conjoined explicit representations for fine detail over two-dimensional space and spatial frequency are partially decoupled during recoding beyond V1 and V2 and irrevocably so beyond V4 such that spatial information for object localization in specific coordinate systems projects largely to the parietal lobe, whereas two-dimensional spatial frequency data for object recognition projects largely to the temporal lobe (Mishkin et al., 1983
). For example, posterior parietal neurons, apart from showing some limited spatial summation properties with respect to target size and luminance, are remarkable for their lack of specificity for object shape (Robinson et al., 1978
). Some such cells encode locations dependent upon eye position in head-centered (Andersen et al., 1985
) or other (Colby, 1998
) coordinate spaces. There are neurons within area PO (V6) that can encode the absolute or `real position' of an attended object independent of direction of gaze (Galletti et al., 1996), and such neurons may provide motor areas with visuospatial information required for arm-reaching movements with respect to the location of a specific target. This computation does not appear to be confounded by the shape or intrinsic detail of the attended object. However, some sensitivity for simple two-dimensional geometric shape has been reported for neurons with typically large receptive fields in the lateral intraparietal cortex (Sereno and Maunsell, 1998
) presumably serving to facilitate the manipulation and grasping of objects (Logothetis, 1998
).
Moreover, the enormously large receptive fields of inferotemporal neurons do not appear to undertake an analysis of fine spatial position in addition to that of object recognition, though there are three important qualifications. First, such neurons show increased sensitivity over foveal and parafoveal regions (Gross, 1976
), and selective attention within a spatial window can differentially enhance signal to noise within the aperture of interest compared to that within ignored regions (Moran and Desimone, 1985
). Moreover, a small percentage of inferotemporal neurons show some sensitivity for encoding both object size and retinal location (Lueschow et al., 1994
). However, none of these mechanisms provide more than coarse localization for individual objects as opposed to the fine spatial representation across the entire visual panorama subserved by neurons in V1 and V2.
There are cross-connections between dorsal and ventral streams (Merigan and Maunsell, 1993
), and perhaps they coarsely bind identification and localization of attended objects. Even so, the cross-connections cannot restore information if it has been lost pursuant to generalization processes within the two streams. For example, suppose that those neurons in V6 that encode the `real position' of an object (Galletti et al., 1995
) could somehow convey such information to IT by cross-connections. Even if possible, such cells would likely be conveying only the central coordinates of the target but not that of its fine structure nor that of the panorama of the associated visual scene. Thus, at the cortical level, if we are to look for neurons that conjointly specify both fine spatial detail and precise retinotopic localization we must look to the simple cells in V1 and V2, and for those neurons that subserve the finest-grain conjoined representations we must, based on present information, look exclusively to the simple cells in V1.
Other evidence supports the role for these early cortices in phenomenal vision. For example, the representation of surfaces (Nakayama and Shimojo, 1992
), which also requires specification of both fine spatial detail and spatial position, can be instantiated prior to the evocation of selective attention (Mattingley et al., 1997
), likely placing such implementation prior to stages of object recognition, and thus at least in part within early visual cortical areas such as V1 and V2. However, although the case for a primary role of neurons in V1 in phenomenal visual experience appears quite strong based on both neurological and physiological results, there remains equally strong evidence that activation of neurons in V1 may not be sufficient to activate a visual percept.
Afferent Activation of Neurons in V1 May Not Generate a Visual Experience
Many patients with structural damage to the right parietal lobe fail to attend to complex visual stimuli in the left hemifield even when tests with individual stimuli show that visual fields are intact and that such patients are not hemianopic (Critchley, 1953
). Other such patients may identify a test object in the left hemifield when it is presented in isolation but not when a competing stimulus is simultaneously shown within the right hemifield (Critchley, 1953
).
Equally informative are case presentations of the simultanagnosias (Critchley, 1953
; Rizzo and Robin, 1990
), wherein patients with lesions of extrastriate cortices may at any one instant see only fragmentary components of the visual field. Luria (1959) described a patient who could perceive a 3 x 2 array of points when asked to search for a rectangle but could experience only a single point when asked to count the dots. Rizzo and Robin (1990) explain simultanagnosia as the inability to sustain visuospatial attention simultaneously across all the elements in an array. Their clinical experiences are matched by behavioral studies in primates showing that animals with V1 intact but with lesions within V4 may identify individual stimuli very well but fail to make correct identifications when a particular stimulus is embedded within a complex array of competing stimuli (Schiller, 1993
; Merigan, 1996
). In these cases, there is every reason to suspect that the non-perceived visual stimuli have excited neurons within the striate cortex and that suppression has initially occurred at a higher level, although no direct test has yet been made. However, two recent psychophysical studies have provided incontrovertible evidence that neurons in V1 can be activated in the absence of a visual percept.
He et al. (1995) used laser interferometry to produce sinusoidal gratings of extremely high spatial frequency close to or just above the foveal resolution limit of 60 c/d (Campbell and Green, 1965
). When they presented an interference pattern at a slightly higher spatial frequency of 67 c/d, the subjects could no longer perceive the grating, although they could detect an orientation-specific loss of sensitivity at 48 c/d, indicating that their non-perceived test stimulus had activated orientation-selective neurons in the primary visual cortex before its trace was `subsequently obliterated by subsequent spatial filtering within the cortex' (S. He, personal communication, provided the specific test values).
In a second experiment, He et al. (1996) found that human observers can identify the orientation of a single small grating patch presented to the periphery of the superior visual field when the patch is viewed in isolation but not when the patch is flanked or `crowded' by similar patches. Orientation-specific adaptation was only minimally reduced by the crowding, suggesting that neurons in the first orientation-selective stage of necessity within V1 were still active. The authors interpret their results as implying that spatial resolution was limited by an attentional filter acting beyond the striate cortex to restrict the availability of visual information to conscious awareness.
In a sense, the second experiment of He et al. represents an example of a `simultanagnosia in normally sighted individuals' and, together with the above-cited neurological studies, suggests either that neurons in V1 are not directly involved in phenomenal visual experience or, alternatively, that something more than initial excitation of neurons in V1 by afferent activity may be necessary to produce a visual percept. The first alternative is preferred by Crick and Koch (1998), who have suggested other reasons to doubt the involvement of V1 in any kind of visual awareness.
The Crick-Koch Conjecture
Crick and Koch (1995a) postulated that only those visual areas that project directly to anterior or `frontal' brain regions that `contemplate, plan and execute voluntary motor outputs' can participate directly in visual awareness. Their conjecture is based initially on an unchallenged assumption that `in going from one visual area to another further up in the visual hierarchy... the information is recoded at each step'. However, they further assume that such recoding, which effectively isolates those neurons that participate in early cortical visual processing from direct access to executive space, automatically precludes these same neurons from participation in visual awareness. In their view, it then follows that we cannot be directly aware of activity in our striate or primary visual cortex (V1) because this region does not project directly to frontal areas.
Crick and Koch also proposed that explicit representations of visual features, coarse-coded neural representations that correlate with percepts or objects, are a necessary but not sufficient condition for visual experience. I find no reason to disagree with this premise. However, within their model there is either an inference that explicit neuronal representations do not exist within V1 because their content would be altered during recoding beyond V1 prior to their projection to planning stages, or a conviction that even if explicit representations exist in V1 that we are unaware of them because of the absence of projections from V1 to planning stages. In any case, an absence of explicit representations within V1 would argue against a direct role for this cortex in visual perception.
Elsewhere, I cited evidence that at least some explicit representations are achieved in V1, and briefly noted that the involvement of the striate cortex in static achromatic form vision could not be excluded based upon neurological experience in brain damaged subjects (Pollen, 1995
). Crick and Koch (1995b) responded to my critique, but several areas of disagreement have remained outstanding. Moreover, hitherto I had not proposed a comprehensive plausible alternative to their conjecture that neurons within a phenomenal perceptual space must project directly to those within executive space.
Can we reconcile the large body of neurological and physiological evidence suggesting that the striate cortex is indispensable for at least one type of phenomenal visual experience with the equally impressive evidence suggesting that excitation of neurons within V1 by afferent activity may at times be insufficient to generate a visual percept? A common solution to this dilemma and a counter-argument to the Crick-Koch conjecture may be possible within a framework that arose from early insights of Locke (1690/1976) and Helmholtz (1860/1962).
| Origins of Adaptive Resonance Theories |
|---|
Locke (1690) surmised that `our mind should often change the idea of its sensation into that of its judgment, and make one serve only to excite the other, without our taking notice of it'. Helmholtz (1860) agreed and emphasized that `it may often be rather hard to say how much of our perceptions (Anschauungen) as derived by the sense of sight is due directly to sensation, and how much of them, on the other hand, is due to experience and training'. He used the term `Vorstellung' or idea `to mean the image of visual objects as retained in the memory, without being accompanied by any present sense-impressions', and the term `Perzeption' or immediate perception to denote an awareness in which `there is no element whatever that is not the result of direct sensation'. For the vast majority of perceptual experience involving spatial structure, he assumed a meld in which idea and immediate perception are combined in different proportions. For Helmholtz, it was `the unconscious processes of association of ideas going on in the dark background of our memory' functioning as `inductive conclusions unconsciously formed' that played upon sense data to produce actual visual experience.
Counterviews have at times prevailed. Skinner (1957) adhered to a rigid `bottom-up' conditioned reflex approach to behavior which Chomsky (1959) assailed in his review of Skinner's book, concluding that `we must attribute an overwhelming influence on actual behavior to ill-defined factors of attention, set, volition, and caprice'.
Citing Chomsky's review and strongly influenced by Cybernetics (Wiener, 1948), Miller et al. (1960) proposed that recursive or feedback loops were the fundamental unit of neuronal activity: such loops allow sensory inputs to be compared against some criteria established within the nervous system and are set up either to match the input to a template or alternatively to recognize incongruities between the two, in which case the network would continue to respond recursively until the incongruity vanished. Subsequently, Pribram (1974), now well aware of both the cortico-cortical back-projections (Kuypers et al., 1965; Pandya and Kuypers, 1969) and the by then well-documented corticofugal projections from V1 to the dorsal LGN (Guillery, 1966; Jones and Powell, 1969), envisioned a progressively differentiating self-organizing feedback loop from active templates (referred to as programmed filters or programmed tapes) within inferotemporal cortex projecting back to striate cortex and to subcortical structures including the LGN. Similarly, Milner (1974) proposed an iterative process for pattern recognition wherein the ascending and descending visual pathways leave mutually consistent trails of facilitated synapses in the complementary pathway.
In reformulating and extending Helmholtz's concepts in current terms, Grossberg (1980) surmised, as had Pribram and Milner, that `sensory data activate a feedback process wherein a learned template, or expectancy, deforms the sensory data until a consensus is reached between what the data "are" and what we "expect" them to be'. For Grossberg (1976) select groups of neurons in a series of visual areas can establish a steady-state adaptive resonance, or reverberation, between regions if their patterns match, and suppress the reverberation if their patterns do not match. Models based upon these ideas for the occipitotemporal pathways have been independently developed or extended by many others, notably Harth (1976, 1987), Edelman (1978, 1987), Carpenter and Grossberg (1987), Finkel and Edelman (1989), Fukushima (1986), Koch (1987), Deacon (1988), Damasio (1989, 1990, 1994), Rolls (1990), Okajima (1991), Mumford (1991, 1992), Humphrey (1992) and Ullman (1995). Grüsser and Landis (1991) have proposed an analogous model for the occipito-parietal pathways suggesting that a continuous updating of information between the retinotopic and spatiotopic coordinates is essential for the biologically relevant perception of extrapersonal space. Quantitative mathematical adaptive models now exist to analyze how brain networks can establish stable sensory and cognitive recognition codes in response to arbitrary sequences of input patterns, to resolve the `stability-plasticity dilemma' so that the brain can keep old memories stable yet remain plastic enough for new learning, and to show how such models can account for a myriad of perceptual phenomena (for reviews, see Carpenter and Grossberg, 1992; Grossberg, 1995).
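The match-or-suppress behavior described above can be illustrated with a deliberately simplified sketch. It is not any of the cited models: the binary patterns, the overlap-based match score, the function name `resonate` and the `vigilance` threshold are all assumptions introduced here for illustration, loosely in the spirit of a template-matching loop.

```python
import numpy as np

def resonate(input_pattern, templates, vigilance=0.8):
    """Return the index of the first stored template that 'resonates'
    with the input, or None if every candidate is suppressed.

    All names and the match criterion are illustrative assumptions.
    """
    x = np.asarray(input_pattern, dtype=bool)
    stored = [np.asarray(t, dtype=bool) for t in templates]

    # Ascending pass: rank templates by how strongly the input activates them.
    overlaps = [int(np.sum(x & t)) for t in stored]
    order = np.argsort(overlaps)[::-1]

    for j in order:
        # Descending pass: compare the template (expectancy) with the input.
        match = np.sum(x & stored[j]) / max(int(np.sum(x)), 1)
        if match >= vigilance:
            return int(j)  # patterns match: reverberation is sustained
        # otherwise the candidate is suppressed and the search continues
    return None

# Hypothetical usage with two stored templates.
templates = [[1, 1, 0, 0], [0, 0, 1, 1]]
print(resonate([1, 1, 0, 0], templates))  # matches the first template
print(resonate([1, 0, 1, 0], templates))  # no template reaches the threshold
```

In this toy version a mismatch simply ends the candidate's participation; the cited theories additionally describe learning, in which a resonating template is refined by the input.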
| Proposed Functions for Adaptive Resonant Loops |
|---|
Evolutionary pressures to develop such feedback loops have probably been based in part upon the need of an organism to discriminate and interpret sensory data on the basis of its past experience and motivational state (Pandya and Yeterian, 1995
Thus, such loops have been proposed to employ active use of higher-level knowledge to disambiguate lower-order percepts (Cavanagh, 1991; Grossberg et al., 1994; Mumford, 1994; Lee et al., 1998), to mediate the play of selective attention upon early image representations (Milner, 1974; Fukushima, 1986; Koch, 1987; Gove et al., 1995), to correlate and synchronize the activity of interrelated groups thereby facilitating the continual updating of the perceptual image (Edelman, 1978; Grossberg and Somers, 1991), to permit parallel exploration and selection of multiple alternatives (Mumford, 1992; Carpenter and Grossberg, 1993; Ullman, 1995), to facilitate binocular fusion by suppression of non-corresponding retinal images in the LGN (Singer, 1977), to provide spatial `shifter circuits' for the computation of fine stereo vision and disparity hyperacuity (Anderson and Van Essen, 1987; Mumford, 1994), to pre-attentively separate figure from ground (Okajima, 1991; Mumford, 1994; Lee et al., 1998), to modulate cortical output across cortical areas (Sandell and Schiller, 1982), to sustain `temporal buffering' when there must be integration of clues otherwise `hidden' over immediately preceding and succeeding spatial or temporal events (Mumford, 1992; Ullman, 1995), and to mediate control of contrast gain of LGN neurons (Grossberg, 1980; Koch, 1987).
These functions are not, in general, mutually exclusive because all imply that the expectancy facilitates activity evoked by sensory afferents within the classical receptive field of target cells and/or suppresses activity evoked by irrelevant features in the surround. Despite some unresolved issues as to how residua (Mumford, 1992), i.e. incongruities or mismatches between learned expectations and sensory inputs, are handled, all theorists share an implicit and usually explicit view that once resonance is achieved, activity in the principal ascending and descending loops represents successive transforms that become largely complementary within each successive stage.
With resonance established, activity at corresponding levels of the afferent and efferent pathways within each cortical area is roughly complementary in the sense that sensory input conveyed in ascending, largely supragranular, pathways matches that of expectations conveyed in descending pathways of largely infragranular origin. Thus, activity represented in `higher', i.e. `anterior', cortices bears a unique relationship to activity in `lower', i.e. `posterior', cortices.
The type of unique relationship suggested is not that of a one-to-one arrangement between neurons at lower and higher levels. Rather, the relationship is many-to-one or convergent in the feed-forward pathways to allow for generalization and abstraction as, for example, carried out by inferotemporal neurons (Gross, 1976). Conversely, the back-projecting pathways support a one-to-many or divergent arrangement so that the neural representations of generalizations may be projected back to lower levels to search for matches over multiple apertures simultaneously, consistent with anatomical and neurological evidence that will be presented later. Thus, neurons in posterior cortices may be continuously updated as to output directly from recognition space. Reports to executive space may emanate from anterior levels of the resonant loop without depriving neurons in early visual cortices of copies of roughly corresponding content. Moreover, executive space may indirectly selectively attend to targets within phenomenal perceptual space by means of projections back to object recognition space, which, in turn, may modify search strategies over phenomenal perceptual space.
The concept of bidirectional and complementary transforms is implicit in the concept of adaptive resonant loops but became especially explicit in the models of Okajima (1991) and Ullman (1995). Koch (1987), in proposing a role for the corticogeniculate projection system in selective attention and gain control, recognized the consequences of strong complementarity: `In the more radical version of this theory, the entire input to striate cortex from the thalamus would be limited to those locations where retinal and cortical-thalamic inputs coincide.' Such a model for strong complementarity between LGN and V1 is not difficult to envision given the tight anatomical coupling in both directions between neurons in the LGN and in layer 6 of V1 (Lund et al., 1975). However, any attempt to maintain that there is reasonable complementarity between ascending and descending pathways along the cortico-cortical loops requires a brief account of pertinent results from anatomical, physiological, psychophysical and functional brain imaging studies.
| Anatomical Studies of the Back-projecting Pathways |
|---|
Pandya and Yeterian (1985) have reviewed the system of reciprocal projections originating from limbic structures, proceeding through the proisocortex in anterior temporal lobe back through inferotemporal cortices (IT) and thence back serially through a succession of extrastriate visual areas to striate cortex (V1) and the LGN. For present purposes it is sufficient to summarize the origins of the back-projections from V1 to LGN and from V2 to V1 because serial projections feeding back from still higher areas follow similar general principles. Moreover, V3 and V4 (Rockland and Van Hoesen, 1994
Distinct sublaminae of layer 6 in V1 of the primate project back to parvocellular, magnocellular, and perhaps to intralaminar neurons within the dorsal LGN, thereby complementing direct ascending pathways to these same sublaminae (Lund et al., 1975; Fitzpatrick et al., 1994). The predominant back-projecting pathway from V2 to V1 originates in layer 6 and the bottom-most tier of layer 5 of V2 and projects in a bifurcating manner to supragranular and infragranular laminae in V1 bypassing the input layers (Rockland and Pandya, 1979; Rockland and Virga, 1987).
The infra- to infragranular terminals end on dendrites relatively close to cell bodies, whereas the back-projections to supragranular layers are especially divergent and terminate on distal dendrites in layers 1 and 2 and inconstantly in layer 3 (Rockland and Virga, 1989). The pattern of terminations might seem to suggest that the infra- to infragranular projections are stronger and may excite neurons, whereas the more distant connections on superficial dendrites might seem to suggest a milder modulatory role. However, the vastly greater number of distal terminals may compensate for this more remote location and it remains possible that many such distal terminals involve active dendritic conductances (Cauller and Connors, 1994; Hoffman et al., 1997). A second back-projecting pathway from V2 to V1 comprises <10% of the total of back-projecting neurons (Rockland and Virga, 1989). This pathway emanates from layer 3A of V2 and has a similar pattern of terminations in V1.
| Physiological Studies of Back-projecting Pathways |
|---|
Physiological studies of LGN neurons have demonstrated various examples of cortically mediated binocular suppression when the extended surrounds beyond the classical LGN receptive field are stimulated (Schmielau and Singer, 1977
Much less is known about center-to-center responses than about suppressive effects from the surround. Early studies showed direct excitatory projections from V1 to neurons in the LGN when their cell bodies are in topographic registration (Schmielau and Singer, 1977; Tsumoto et al., 1978; Ahlsen et al., 1982). However, other studies that attempted to demonstrate such effects in other paradigms have given variable and inconstant results, raising doubt as to how the corticofugal excitatory influence is conveyed to LGN neurons (Baker and Malpeli, 1977). More recently, however, stimulus-dependent synchronization of LGN neurons with non-overlapping receptive fields was demonstrated when bar stimuli set to jointly stimulate these fields over distances comparable to the preferred lengths of neurons in layer 6 of V1 were tested (Sillito et al., 1994). Such synchronization can lead to enhanced spatiotemporal summation at a cortical level and thereby reinforce the LGN→V1→LGN loop (Sillito et al., 1994).
Recently, we discovered a general principle mediating the corticofugal control of macaque LGN neurons (Przybyszewski et al., 1998). The gain of the contrast-response function of LGN neurons is substantially reduced by reversible inactivation of V1 in a manner that implies a robust role for a multiplicative or non-linear control of the contrast gain of LGN neurons by corticofugal projections under normal conditions. Thus, the activity of LGN neurons is generally substantially enhanced when their retinal inputs match the corticofugal output of the striate cortex. These effects apply to luminance processing by both magnocellular and parvocellular neurons and the processing of isoluminant chromatic stimuli by parvocellular neurons.
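What a multiplicative control of contrast gain would mean can be sketched numerically. The sketch below is an illustration only, not the analysis of Przybyszewski et al. (1998): the saturating (hyperbolic-ratio) form of the contrast-response function, the function name `lgn_response` and every parameter value (`r_max`, `c50`, `n`, `cortical_gain`) are assumptions chosen for the example.

```python
def lgn_response(contrast, cortical_gain=1.0, r_max=50.0, c50=0.3, n=2.0):
    """Hypothetical LGN firing rate (spikes/s) for a stimulus contrast in [0, 1].

    The retinal drive follows an assumed saturating contrast-response
    function; the corticofugal signal scales it multiplicatively.
    """
    retinal_drive = r_max * contrast**n / (contrast**n + c50**n)
    return cortical_gain * retinal_drive

# Compare an intact loop (gain 1.0) with an assumed reduced gain (0.4)
# standing in for reversible inactivation of V1.
for c in (0.1, 0.3, 0.6, 1.0):
    intact = lgn_response(c, cortical_gain=1.0)
    inactivated = lgn_response(c, cortical_gain=0.4)
    print(f"contrast={c:.1f}  intact={intact:.1f}  V1 inactivated={inactivated:.1f}")
```

Under a purely multiplicative scheme the ratio of the two responses is the same at every contrast, so the whole curve is scaled rather than shifted; an additive influence would instead change the responses by a constant amount.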
There are fewer direct demonstrations of the effect of back-projecting activity to V1. Sandell and Schiller (1982) found decreased activity of neurons in the infragranular layers of V1 following selective inactivation of V2 by cooling, implying the possibility of robust excitatory effects on V1 neurons under normal conditions. Payne et al. (1996) have shown that the response of the center of the receptive field of V1 neurons in the macaque to visual stimulation decreases during GABA-induced selective inactivation of the retinotopically corresponding region of V2, while the response to stimulation of the surround increases. Thus, the normal distinctions between stimulations of the center and surround are blurred during inactivation of V2. Payne et al. (1996) suggest that the main effect of feedback from V2 to V1 is to increase the selectivity of neurons in V1 for small stimuli activating the receptive field center, a conclusion in keeping with adaptive filtering and our own results on feedback from V1 to LGN (Przybyszewski et al., 1998). Facilitatory modulations of neurons in V1 and V2 during selective attention, which must necessarily be mediated by back-projecting pathways, have also been demonstrated (Motter, 1993; Press et al., 1994; Roelfsema et al., 1998).
| Psychophysical, Functional Imaging and Neurological Studies of Visual Imagery |
|---|
Visualization of a previously viewed grating pattern at an appropriate distance from the target can lower the threshold for detecting a similar grating pattern within the target area (Ishai and Sagi, 1995
Whether such activation of early visual cortical areas, which is sometimes but not invariably found, is an essential (Kosslyn and Ochsner, 1994) or perhaps incidental (Moscovitch et al., 1994; Roland and Gulyas, 1994) component of visual imagery remains controversial. The neurological literature has not yet resolved the issue. For example, Goldenberg et al. (1995) described a patient with apparently preserved visual imagery despite severe but not complete damage to V1. The patient initially appeared blind after bilateral posterior cerebral artery occlusions. However, subsequent MRI examination demonstrated islands of intact cortex at the occipital tip of the upper left calcarine lip, and the patient eventually recovered her sight within the central 5° of the right inferior quadrant. Thus, Goldenberg et al. (1995) conclude that `our case can neither confirm nor ultimately discard the possibility that the preservation of at least small islands of primary visual cortex is necessary for the preservation of visual imagery'. In any case, for present purposes the next key issue is to try to determine how the neural correlates of visual imagery may differ from those of a number of infrequent states in which phenomenal visual experience exists in the absence of concurrent retinal stimulation.
| Phenomenal Visual Experience in Hemianopic Fields |
|---|
Palinopsia (Critchley, 1953
Colored patterns or isolated bright or colored elemental spots of light or `phosphenes' occur as perceptual correlates of irritative phenomena and may be experienced by subjects within hemianopic fields (Kölmel, 1984). Evidence suggests that the irritative stimulus originates in prestriate areas but that the phenomenal experience depends upon t

