Cerebral Cortex, Vol. 12, No. 8, 818-830, August 2002
© 2002 Oxford University Press
Machine Psychology: Autonomous Behavior, Perceptual Categorization and Conditioning in a Brain-based Device
W.M. Keck Laboratory, The Neurosciences Institute, 10640 John Jay Hopkins Drive, San Diego, CA 92121, USA
Jeffrey L. Krichmar, W.M. Keck Laboratory, The Neurosciences Institute, 10640 John Jay Hopkins Drive, San Diego, CA 92121, USA. Email: krichmar@nsi.edu.
Abstract
In studying brain activity during the behavior of living animals, it is not possible to analyze all levels of control simultaneously, from molecular events to motor responses. To provide insights into how levels of control interact, we have carried out synthetic neural modeling using a brain-based real-world device. We describe here the design and performance of such a device, designated Darwin VII, which is guided by computer-simulated analogues of cortical and subcortical structures. All levels of Darwin VII's neural architecture can be examined simultaneously as the device behaves in a real environment. Analysis of its neural activity during perceptual categorization and conditioned behavior suggests neural mechanisms for invariant object recognition, experience-dependent perceptual categorization, first-order and second-order conditioning, and the effects of different learning rates on responses to appetitive and aversive events. While individual Darwin VII exemplars developed similar categorical responses that depended on exploration of the environment and sensorimotor adaptation, each showed highly individual patterns of changes in synaptic strengths. By allowing exhaustive analysis and manipulation of neuroanatomy and large-scale neural dynamics, such brain-based devices provide valuable heuristics for understanding cortical interactions. These devices also provide the groundwork for the development of intelligent machines that follow neurobiological rather than computational principles in their construction.
Introduction
A central goal of research in the neurosciences is to understand the relationships between brain structure, function and behavior. Several related factors make this a challenging task. One is the sheer complexity of neuroanatomical networks overlain by the physiological subtleties of brain dynamics. Another is the number of levels of control ranging from molecular events to perception, memory and the coordination of movement. Each level requires the analysis of a number of simultaneous causal factors and chains acting in parallel. The environment itself and interactions between an organism and its econiche add further complexity.
In dealing with this degree of complexity, careful experimental analysis and theory building are obviously essential. However, analytic approaches conducted separately at each level are unlikely alone to provide a full picture of neural patterns in a behaving organism. There are obvious limits on the number of levels simultaneously observable during any given experiment. Moreover, despite the power of mathematical and computational approaches, they have not yet provided a multilevel picture of the non-linear relationships between brain and behavioral events.
To confront these issues and complement these approaches, we have adopted a procedure called synthetic neural modeling (Reeke et al., 1990; Edelman et al., 1992). This consists of building devices capable of behavior, providing them with a computationally simulated nervous system based on known biological principles of neuroanatomical organization and physiological activity, and then following the behavioral and neuronal responses of such a construct in real time, in a real-world environment. By following behavioral and brain responses completely at all levels of control in a particular environment, one can formulate a synthetic picture that has heuristic value in interpreting data obtained from behaving animals.
A series of such brain-based devices capable of increasingly sophisticated autonomous performance has been tested over the last decade (Edelman et al., 1992; Almassy et al., 1998; Krichmar et al., 2000; Sporns et al., 2000). In these earlier devices, we demonstrated the learning of perceptual responses and emphasized the role of value systems. Value systems are neural structures that are necessary for an organism to modify its behavior based on the salience or value of an environmental cue (Friston et al., 1994). The value system in a brain-based device is analogous to ascending neuromodulatory systems in that its units show uniform phasic responses when activated and its output acts diffusely over multiple pathways by modulating synaptic change (Schultz et al., 1997; Sporns et al., 2000).
In the present report, we describe the construction and performance of Darwin VII, a device capable of perceptual categorization and conditioned behavior. We have extended previous conditioning experiments to include second-order conditioning and have carried out an extensive analysis of the responses of simulated neuronal units. By probing simultaneous brain and behavioral responses at all levels during perceptual and conditioning tasks, we have obtained several new insights into the organization of autonomous behavior. These include a richer understanding of the effects of individual history on learning, of the possible origins of invariant object recognition in an analogue of the inferotemporal cortex, and of the relation of changes in synaptic efficacy to appetitive and aversive conditioned responses.
Materials and Methods
We have developed a heuristic in which a neurally organized mobile adaptive device (NOMAD) explores its environment and through experience-dependent learning develops adaptive behaviors. NOMAD is a part of the Darwin series of automata in which theories of the nervous system are tested by implementing brain-based devices (Reeke et al., 1990
Darwin VII's behavior was guided by a nervous system simulated on a computer workstation (see Appendix, part A). The simulation was based on the anatomy and physiology of vertebrate nervous systems, but with fewer neurons and a simpler architecture. The simulated nervous system was made up of a number of areas labeled according to the analogous cortical and subcortical brain regions. Each area contains different types of neuronal units consisting of simulated local populations of neurons or neuronal groups (Edelman, 1987
In the present experiments, the simulated nervous system contained 18 neuronal areas, 19 556 neuronal units, and ~450 000 synaptic connections. Figure 2 shows a high-level diagram of the different neural areas and the synaptic connections between neural areas in the simulated nervous system. Further details of the parameters describing neuronal unit activity and neuronal unit connectivity can be found in the Appendix (see Tables A1 and A2, and Appendix, parts B and C). Each simulation cycle took ~200 ms of real time. A simulation cycle is the period during which the current sensory input is processed, the activities of all neuronal units are computed, the connection strengths of all plastic connections are computed, and motor output is generated (see Appendix, parts A and B). Connections between and within neuronal areas were subject to activity-dependent modification following a value-independent (see Appendix, part C) or value-dependent (see Appendix, part D) synaptic rule. Synaptic modification was determined by both pre- and post-synaptic activity and resulted in either strengthening or weakening of the synaptic efficacy between two neuronal units. We used a modified Bienenstock, Cooper and Munro (BCM) learning rule to govern synaptic change because it has a region in which weakly correlated inputs are depressed and strongly correlated inputs are potentiated (Bienenstock et al., 1982). Simplifying the BCM learning rule by making it piecewise linear and fixing the thresholds resulted in an efficient, biologically based learning rule (see Appendix, part C). Plastic connections that were value-dependent were made between areas involved in responses to salient environmental events [A1/IT → Mapp/Mave, A1/IT → S; see Aston-Jones and Bloom, 1981; Ljungberg et al., 1992]. Plastic connections that were not value-dependent were made between areas where perceptual categories were to be learned from experience (VAP → IT, LCoch/RCoch → A1). Non-plastic connections were between neural areas where there were reflex responses (Tapp/Tave → Mapp/Mave, R → C), local projections within an area (IT → IT, A1 → A1), or between areas where it was assumed that plasticity had already occurred very early in development [R → VAP; see Crair et al., 1998]. On the assumption that these synaptic changes do not saturate or persist indefinitely, we used a passive synaptic decay term (see the decay term in Appendix, part C) to express a decline in synaptic strength in the absence of activity. Activation of the simulated value system (area S, Fig. 2) signaled the occurrence of salient sensory events and contributed to the modulation of connection strengths of all active synapses in the affected pathways (see value-dependent projections in Fig. 2). For example, tasting a block picked up by Darwin VII's gripper is a salient event affecting subsequent behavior that is reinforced or weakened through synaptic change. Area S is thus analogous to an ascending neuromodulatory value system (Schultz et al., 1997; Sporns et al., 2000).
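The shape of such a rule can be illustrated with a minimal sketch. It is not the rule specified in the Appendix: the function names, the two fixed thresholds, the slopes, the learning rate `eta`, the decay rate `eps` and the multiplicative use of the value signal are all assumptions chosen only to show a piecewise-linear BCM-like curve with a depression region, a potentiation region, passive decay toward the initial weight, and optional value modulation.

```python
def bcm_piecewise(post, theta1=0.2, theta2=0.6, k1=0.5, k2=1.0):
    """Piecewise-linear BCM-like function of postsynaptic activity.

    All thresholds and slopes are hypothetical, not the paper's values.
    Below theta1 nothing changes; between theta1 and theta2 the synapse
    is depressed; above theta2 it is potentiated.
    """
    if post < theta1:
        return 0.0
    midpoint = 0.5 * (theta1 + theta2)
    if post < midpoint:
        return k1 * (theta1 - post)   # depression deepens
    if post < theta2:
        return k1 * (post - theta2)   # depression recovers toward zero
    return k2 * (post - theta2)       # potentiation


def update_weight(w, w0, pre, post, eta=0.1, eps=0.01, value=None):
    """One update of a plastic connection (illustrative only).

    w     : current connection strength
    w0    : initial connection strength (target of the passive decay)
    value : optional value-system activity; None means value-independent
    """
    decay = eps * (w0 - w)                      # passive decay toward w0
    drive = eta * pre * bcm_piecewise(post)     # activity-dependent change
    if value is not None:
        drive *= value                          # assumed multiplicative gating
    return w + decay + drive


if __name__ == "__main__":
    w = 0.2
    for post in (0.1, 0.3, 0.5, 0.8):
        print(post, update_weight(w, w0=0.2, pre=1.0, post=post))
```

With these assumed parameters, weakly driven postsynaptic units lose strength and strongly driven ones gain it, while an inactive connection relaxes back toward its starting value.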
Experimental evidence suggests that key parameters of neural plasticity may vary over the course of postnatal development (Kato et al., 1991
In the experiments in which individual variation was to be examined, each Darwin VII subject shared the same physical device, but had an instantiation in which the simulated nervous system was unique, as a result of different random initializations within the constraints given by Table A2, in both the connectivity between individual neuronal units and the initial connection strengths between those units. Because the connectivity between neuronal units was constrained by a common set of projections, however, large-scale connectivity (i.e. projections between neural areas) was similar between subjects. Details of the neuro-anatomical constraint parameters for each synaptic projection, as well as parameters for the synaptic efficacy rules and the projection distributions, can be found in the Appendix (part C and Tables A1 and A2).
Darwin VII's environment consisted of an enclosed area with black walls and a floor covered with opaque black plastic panels, on which we distributed stimulus blocks (6 cm metallic cubes) in various arrangements (Fig. 1). The top surfaces of the blocks were covered with removable black and white patterns; the other surfaces of the blocks were featureless and black. All experiments reported in this paper were carried out with multiple exemplars of two basic designs: blobs (several white patches 2-3 cm in diameter) and stripes (width 0.6 cm, evenly spaced). Stripes on blocks in the gripper can be viewed in either horizontal or vertical orientations, yielding a total of three stimulus classes of visual patterns to be discriminated (blob, horizontal and vertical). A flashlight mounted on Darwin VII and aligned with its gripper caused the blocks, which contained a photodetector, to emit a beeping tone when Darwin VII was in the vicinity. The sides of the stimulus blocks were metallic and could be rendered either strongly conductive ('good taste' or appetitive) or weakly conductive ('bad taste' or aversive). Gripping of stimulus blocks activated the appropriate taste neuronal units (either area Tapp or area Tave) to a level sufficient to drive the motor areas above a behavioral threshold. In the experiments described in this paper, strongly conductive blocks with a striped pattern and a 3.9 kHz tone were chosen arbitrarily to be positive value exemplars, whereas weakly conductive blocks with a blob pattern and a 3.3 kHz tone represented negative value exemplars.
Basic modes of behavior built into Darwin VII included IR sensor-dependent obstacle avoidance, visual exploration, visual approach and tracking, gripping and tasting, and two main classes of innate behavioral reflex responses (appetitive and aversive). With the exception of obstacle avoidance, selection among the above behaviors was under control of the simulated nervous system. Appetitive and aversive responses were triggered when the difference in activity between motor areas Mapp and Mave exceeded a threshold (Fig. 2). These responses could be activated by taste (the unconditioned stimulus, US, triggering an unconditioned response, UR) or by auditory or visual stimuli (the conditioned stimulus, CS, triggering a conditioned response, CR). Prior to conditioning, taste triggered the behavioral responses; after conditioning, either a visual pattern or an auditory pattern could evoke behavioral responses. Unconditioned appetitive and aversive behavioral responses consisted of prolonged gripping and tasting of a stimulus block, releasing the block, and then turning counterclockwise. Conditioned appetitive responses, which occurred when motor area activity exceeded the threshold before tasting, differed from unconditioned appetitive responses in that a clockwise turn was executed after tasting a block. In conditioned aversive responses, Darwin VII avoided a stimulus block by backing away without picking it up and then turning clockwise. Thus, during the conditioning experiments, in which many stimuli were encountered over an extended period of time, Darwin VII developed perceptual categories that modified its behavioral responses.
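The response logic described above can be summarized in a short sketch. The threshold value, the sign convention (Mapp minus Mave), and all function and type names are assumptions made for illustration; only the rule that a response is triggered when the Mapp/Mave difference exceeds a threshold, and that a response triggered before tasting counts as conditioned, comes from the text.

```python
from enum import Enum


class Response(Enum):
    NONE = "none"
    APPETITIVE = "appetitive"
    AVERSIVE = "aversive"


def select_response(m_app, m_ave, threshold=0.3):
    """Pick a reflex class from motor-area activity (threshold is hypothetical)."""
    diff = m_app - m_ave
    if diff > threshold:
        return Response.APPETITIVE
    if -diff > threshold:
        return Response.AVERSIVE
    return Response.NONE


def classify_trial(pre_taste, post_taste=None, threshold=0.3):
    """Label one stimulus encounter.

    pre_taste  : (m_app, m_ave) before the block is tasted
    post_taste : (m_app, m_ave) after tasting, or None if the block was
                 never picked up (as in a conditioned aversive response)
    """
    early = select_response(*pre_taste, threshold=threshold)
    if early is not Response.NONE:
        return early, "conditioned"        # CS alone crossed the threshold
    if post_taste is not None:
        late = select_response(*post_taste, threshold=threshold)
        if late is not Response.NONE:
            return late, "unconditioned"   # taste (US) drove the response
    return Response.NONE, None


if __name__ == "__main__":
    print(classify_trial((0.1, 0.1), (0.9, 0.1)))  # taste-driven trial
    print(classify_trial((0.1, 0.8)))              # avoided before gripping
```

The four non-empty outcomes of `classify_trial` correspond to the appetitive and aversive variants of the unconditioned and conditioned responses.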
Results
We describe details of two sets of experiments that demonstrate the usefulness of synthetic brain-based devices in testing theories of the nervous system and in understanding how interactions of the nervous system, the body, and the environment affect behavior. The first set focused on visual perceptual categorization and invariance in cortical responses; the second investigated conditioning experiments involving multiple sensory modalities.
Perceptual Categorization
Perceptual categorization is the ability to discriminate and categorize sensory stimuli (Clark et al., 1988; Kilgard and Merzenich, 1998). Development of this ability is necessary for learning and conditioning and, for this reason, was extensively explored in Darwin VII. In primates, the inferotemporal cortex is an area that is believed to be associated with visual object recognition (Tanaka, 1996). In Darwin VII, activity in the simulated inferotemporal cortex, area IT (Fig. 2), provided the basis for visual perceptual categorization. Initially, IT's responses to visual stimuli were weak and diffuse (see IT activity in Fig. 3A). After approximately five stimulus encounters, activity-dependent plasticity between VAP and IT caused IT responses to the different stimuli to become strong, sharp and separable (see IT activity in Fig. 3B). It is this strong, discriminative activity of neuronal groups within IT in response to visual stimuli, together with the appropriate behavioral response, that we refer to as visual categorization in Darwin VII.
Invariant Object Recognition
In animals, perceptual categorization in the inferotemporal cortex is invariant with respect to differences in position, scale and rotation of an object (Tanaka et al., 1991; Tovee et al., 1994; Ito et al., 1995; Rolls and Tovee, 1995; Tanaka, 1996). Such invariant object recognition has been difficult to achieve in computer vision systems (Mundy and Zisserman, 1992; Mundy et al., 1992; Shashua, 1993; Weinshall, 1993). In the present work, however, Darwin VII's object recognition was observed to be invariant with respect to scale, position and rotation. Visual categorization of a stimulus occurred no matter where an object appeared in Darwin VII's visual field, with the apparent size of the stimulus ranging from a maximum when the object was directly in front of Darwin VII (Fig. 3A) to one-quarter of the maximum size when the object was distal to Darwin VII. Correct categorization of striped blocks in Darwin VII's field of vision, when blocks were not in its gripper, occurred when the stripes on the blocks were rotated over a range of ±30° of a horizontal or vertical reference.
Invariant object recognition required continuous, time-varying sensory input while Darwin VII moved. Invariant responses developed as a result of competition among activity-dependent plastic connections between retinotopically mapped VAP and non-topographically mapped IT. The connections that were potentiated earliest were those with VAP receptive fields corresponding to regions near Darwin VII's gripper, regions where IT responses to the neural stimulus were first sustained (Fig. 4, top, 'First Horizontal Striped Block'). These connections had a competitive advantage; they received not only the earliest but also the longest exposure to the stimulus as a result of the time spent by the block in the gripper. The maintenance of discriminative, persistent patterns of neuronal groups in IT required sustained high activity resulting from strengthening of plastic connections with VAP neuronal units that received continuously varying images of the block. Upon each approach and withdrawal from the stimulus block, the number of potentiated connections increased, resulting in recruitment of neuronal units with receptive fields that responded to visual stimuli beyond Darwin VII's gripper (Fig. 4, top). An example of the resultant activity in VAP and IT during invariant object recognition is shown in the bottom two rows of Figure 4. When the temporal sequence of the images leading to invariance was artificially shuffled (Almassy et al., 1998), invariant object recognition did not occur. As further considered in the Discussion, the invariance arose mainly as a result of an initial strengthening of VAP to IT synapses that was reinforced and expanded by subsequent inputs from the stimulus block during Darwin VII's movements.
Stimulus History and Individual Variation
Differences in an individual's perceptual history can have an effect on the organization and response of the nervous system. For example, more neurons in the monkey inferotemporal cortex respond to familiar than to unfamiliar stimuli (Kobatake et al., 1998). Using Darwin VII, we performed experiments concerned with experience-dependent effects on categorization during the development of perceptual categories as well as after such development.
We first investigated the effect of variations in presentation frequency of each stimulus class on the development of neuronal unit responses. Darwin VII explored an environment that was partially segregated into two equal-sized areas. One area mainly contained blocks with blobs and the other area mainly contained blocks with stripes. In each of 14 separate experiments, Darwin VII started with an identical simulated nervous system that had not sampled any stimuli. The number of neuronal units in IT responding to a given stimulus (whether blob, horizontal stripe or vertical stripe) increased selectively with an increase in the frequency of presentation of that stimulus class. Statistical significance was tested using Pearson's product-moment correlation, r. Stimulus presentation frequency was found to be positively correlated with patterned neural activity in IT that was individually characteristic for each of the visual stimulus classes (blob: r = 0.71, P < 0.005; horizontal: r = 0.75, P < 0.003; vertical: r = 0.61, P < 0.03). These findings are similar to the results of neuronal recordings in the monkey inferotemporal cortex in that more IT neurons responded to familiar than to unfamiliar objects in recognition tasks (Kobatake et al., 1998).
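For readers who wish to reproduce this kind of analysis on their own recordings, the product-moment correlation can be computed as in the following sketch. The function name and the example numbers are hypothetical; they are not data from these experiments, and the sketch does not compute the P values reported above.

```python
import math


def pearson_r(xs, ys):
    """Pearson's product-moment correlation between two equal-length samples."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two samples of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sample")
    return sxy / math.sqrt(sxx * syy)


if __name__ == "__main__":
    # Hypothetical per-experiment values: how often one stimulus class was
    # presented, and how many IT units responded to it afterwards.
    presentations = [2, 4, 5, 7, 9, 12]
    responding_units = [10, 14, 13, 20, 22, 27]
    print(pearson_r(presentations, responding_units))
```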
In these experience-dependent responses, competitive and selective interactions among neuronal units from VAP to IT and within IT governed the changes in the number of those units that responded to a stimulus. An increase in neuronal group size reflected the activity-dependent changes in synaptic connections from neuronal units in VAP to neuronal units in IT, leading to increased activity in IT. Through intrinsic excitatory connections, this increased activity further recruited neighboring neuronal units in IT. The change in neuronal group size was competitive: a group specific to one stimulus could grow at the expense of another neuronal group associated with another stimulus (Clark et al., 1988).
In the second set of experiments on experience-dependent perceptual categorization, we studied the effect of stimulus presentation frequency on neural mapping in IT after visual categories had already been developed. To reach this level of experience, Darwin VII sampled an equal proportion (eight each) of blocks in the three stimulus classes. Darwin VII then sampled either eight additional stimuli containing all three stimulus classes or eight additional stimuli containing any two out of the three stimulus classes. Thus, some stimuli were more frequently sampled than others.
In contrast to the previous experiments on early development, after extensive experience, the number of neuronal units in IT responding to more frequently sampled stimuli did not change significantly, suggesting that responses in IT had become saturated with respect to the familiar stimuli. However, in the experiments in which Darwin VII responded to the less frequently sampled stimulus, the number of IT neuronal units was significantly less than that in the controls (Table 1). Two factors appear to be responsible for these results. First, the growth of the neuronal groups in IT was limited by intrinsic excitatory and inhibitory connections. Recurrent excitation caused the size of the groups to grow, but lateral inhibition kept that growth in check (see IT activity in Fig. 3B). At a certain size, the different neuronal groups that were active in response to a visual stimulus competed with each other and their growth was halted. In essence, the memory for these perceptual categories is stable. Secondly, the decrease in neuronal units responding to an under-sampled stimulus was governed, in part, by the decay rate in the activity-dependent synaptic efficacy rule (see Appendix, part C). This caused the efficacy of each synaptic connection that had not been recently updated to decay towards its original value. If, for example, the blob visual stimulus was not encountered for a protracted period of time, synaptic connections from VAPB to IT weakened and fewer IT neuronal units responded to that stimulus class. In essence, the perceptual category was forgotten.
In addition to the influences of environmental experience on perceptual categorization, there were noteworthy individual variations in neuronal response patterns related to behavioral differences. Seven Darwin VII subjects, each with nervous systems having different initial conditions in connectivity and connection strengths, were allowed to sample at least 10 aversive and 10 appetitive blocks. The IT activity patterns showed significant variations between subjects and between stimulus classes within the same subject (Fig. 5
Response Sampling
In contrast to the limited number of cells whose activity can be monitored in live animals, the design of Darwin VII allowed us to record the activity of all neuronal units. Neurophysiologists often test whether limited samples from brain areas are robust predictors of responses to input stimuli (Bialek and Zee, 1990; Theunissen and Miller, 1991; Brown et al., 1998). It was therefore of interest to investigate whether a sparse sampling of neural patterns in the simulated area IT would reliably predict the response to visual stimuli by Darwin VII. We allowed seven individually different Darwin VII subjects to sample at least 20 aversive and 20 appetitive blocks. For each Darwin VII subject, patterns of activity in IT during the development of visual categories (i.e. exposure to the first 10 aversive and appetitive exemplars) were compared with patterns of activity in IT after categorization (i.e. exposure to the last 10 aversive and appetitive exemplars; see Appendix, part G). The accuracy of classification based on IT activity improved with each stimulus exemplar to near perfect performance (Fig. 6). Classifications remained accurate even when relatively small sub-populations (1% of the neuronal units) in IT were sampled; below this range, prediction failed. The relatively small proportion of neuronal units in IT sufficient to classify responses to a given stimulus is in accord with results in live animals, as seen, for example, in the limited number of hippocampal neurons needed to reconstruct a rat's position in space (Wilson and McNaughton, 1993) or the limited number of motor cortical neurons needed to predict a monkey's hand position (Georgopoulos et al., 1986).
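The idea of classifying a stimulus from a sparse sample of units can be sketched as follows. This is not the procedure of Appendix, part G: a nearest-centroid classifier is swapped in as a simple stand-in, and the function names, the 100-unit toy patterns and the 5% sampling fraction are all assumptions made for illustration.

```python
import random


def sample_units(n_units, fraction, rng):
    """Choose a random subset of unit indices (at least one unit)."""
    k = max(1, round(n_units * fraction))
    return sorted(rng.sample(range(n_units), k))


def centroid(patterns, idx):
    """Mean activity over training patterns, restricted to the sampled units."""
    return [sum(p[i] for p in patterns) / len(patterns) for i in idx]


def classify(pattern, centroids, idx):
    """Return the label whose centroid is nearest to the sampled activity."""
    probe = [pattern[i] for i in idx]

    def sq_dist(label):
        return sum((a - b) ** 2 for a, b in zip(probe, centroids[label]))

    return min(centroids, key=sq_dist)


if __name__ == "__main__":
    rng = random.Random(0)
    # Toy activity patterns over 100 hypothetical units: each class drives
    # a different half of the population.
    training = {
        "stripe": [[1.0] * 50 + [0.0] * 50],
        "blob": [[0.0] * 50 + [1.0] * 50],
    }
    idx = sample_units(100, 0.05, rng)   # record from only 5% of the units
    cents = {label: centroid(ps, idx) for label, ps in training.items()}
    print(classify([1.0] * 50 + [0.0] * 50, cents, idx))
```

In this noise-free toy case any sampled unit discriminates the two classes; with noisy, overlapping patterns the sampled fraction needed for reliable classification would have to be determined empirically.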
Conditioning Experiments
In a series of conditioning experiments, Darwin VII was trained to associate the taste value of objects with their visual and auditory characteristics. Weakly conductive objects were assigned innate negative value ('bad taste') and strongly conductive objects were assigned innate positive value ('good taste'). In accord with our prior and arbitrary assignments of block properties, Darwin VII, through experience-dependent learning, associated the blob visual pattern and 3.3 kHz beeping tone with negative value, and the striped visual pattern and 3.9 kHz beeping tone with positive value. Seven individually different Darwin VII subjects participated in the experiments, in which each subject encountered at least 10 appetitive and 10 aversive blocks. In experiments in which only visual stimuli were paired with taste, positive conditioned responses occurred in >70% of trials after encountering the sixth exemplar and in >90% after encountering the tenth exemplar. In auditory conditioning trials, conditioned responses occurred in over 80% of trials following exposure to the sixth exemplar. While performance improved with training, it never reached perfection and occasional mistakes were made. This unpredictability is a property of selectionist systems in general. These are systems consisting of a population of variant repertoires which can be differentially amplified, thus yielding responses to unpredicted or novel events. Such selection has been proposed as being a property of real nervous systems (Edelman, 1987). The unpredictability of behavioral responses in Darwin VII coupled with the variability of a complex environment did not, however, prevent the device from learning after mistakes, from generalizing over sensory inputs, and even from dealing with novel situations.
Early during the conditioning trials, Darwin VII picked up and tasted blocks that led to either appetitive or aversive responses (see Fig. 3A). During this period, it was the output of the taste neuronal units that activated the value system (S) and drove the motor neuronal units (Mapp and Mave) to cause a behavioral response. After conditioning, however, both the value system and the motor neuronal units were immediately activated upon the onset of IT's response to a visual pattern or A1's response to a tone. This shift following learning, from value system activity that was triggered in early trials by the unconditioned stimulus to value system activity triggered at the onset of the conditioned stimulus, is analogous to the shift in dopaminergic neuronal activity found in the primate ventral tegmental area after conditioning (Schultz et al., 1997).
After associating visual patterns with taste, Darwin VII continued to pick up and taste stripe-patterned blocks, but avoided blob-patterned blocks (see Fig. 3B). After associating auditory sounds with taste, Darwin VII continued to pick up the high frequency beeping blocks, but avoided the low frequency beeping blocks (see Fig. 3C).
We extended the training paradigm by carrying out second-order conditioning experiments (Rescorla, 1980). In the first stage of conditioning, a single conditioned stimulus (CS1; either the tone or the visual pattern) was paired with taste for ~10 encounters with each block type until learning was achieved. In the second stage of conditioning, the two conditioned stimuli (CS1 and CS2, tone and visual pattern) were paired together for ~10 encounters with each block type. After the second stage, Darwin VII's performance was tested by presenting CS2 alone for 10 encounters of each block type. There were four possible behavioral responses for each stimulus encounter: (i) appetitive unconditioned response, (ii) appetitive conditioned response, (iii) aversive unconditioned response, and (iv) aversive conditioned response. When CS1 was visual and CS2 was auditory (see high tone and low tone on the left side of Fig. 7), Darwin VII made the appropriate appetitive and aversive conditioned responses to auditory stimuli. However, when CS1 was auditory and CS2 was visual, Darwin VII responded incorrectly to visual aversive stimuli (see blobs on the left side of Fig. 7). As we discuss later, this resulted from the fact that, with this sequence of conditioned stimuli in the aversive learning condition, the blocks were avoided before gripping, and thus taste reinforcement could not occur. Examination of the synaptic weights between area IT and the motor neuronal units in this case showed that the connection strengths from IT to Mapp were greater than from IT to Mave (Fig. 8A). By altering the synaptic efficacy function (Fig. 8, inset), we were able to ensure that aversive stimuli evoked a stronger learning response than appetitive stimuli (Fig. 8B). This change led to more balanced synaptic weights and more appropriate behavioral responses (Fig. 7, right side).
Discussion
Brains in organisms as complex as vertebrates differ in many ways from digital computers or Turing machines. Unlike computer inputs, signals from the environment are not unequivocally coded. Brains of different individuals within a species vary enormously in development, history and fine-scale physiology. Sensorimotor experience is also highly individual despite characteristic species behaviors. Moreover, perceptual categorizations and memories are not simple replicas of world input patterns. Instead, these products of higher brain functions adapt in a species-specific fashion to environmental change. Since many of the functions of individual brains result from complex dynamic interactions at a variety of levels, the elucidation of underlying mechanisms requires simultaneous measurements and sampling across these levels. In living animals, these are difficult to obtain and compare.
These considerations suggest that synthetic modeling of the kind described in this paper may be a useful strategy in attempts to understand higher brain functions. The behavior of Darwin VII shows that a synthetic brain-based device operating on biological principles and without pre-specified instructions can carry out perceptual categorization and conditioned responses. The successful performance of the device rests on the selectional modulation of its neuronal activity by behavior as well as on the existence of constraints provided by its value system. In both the perceptual categorization and conditioning experiments, the development of categorical responses required exploration of the environment and sensorimotor adaptation through specific and highly individual changes in connection strengths. We observed Darwin VII's overall behavior while at the same time recording the state of every neuronal unit and synaptic connection in its simulated nervous system. By collecting these neuronal data, we were able to demonstrate the development of neuronal groups during categorization and recognition by individual subjects (Fig. 5), to show that reliable classification of responses to visual stimuli could be based on the sampling of a small sub-population of neuronal units (Fig. 6), and to relate learning responses to functional changes in synaptic efficacy (Figs 7 and 8).
Darwin VII's nervous system has three features that are critical for understanding the mechanisms underlying perceptual categorization: (i) connectivity from a topographically mapped primary area with transient activity to a non-topographically mapped higher area with more persistent activity; (ii) sensory input that is continuous and temporally correlated with self-generated movement; and (iii) activity-dependent learning in which competitive mechanisms categorize sensory information and select for appropriate behavioral repertoires.
All of these features were necessary for Darwin VII to achieve invariant object recognition. Because a given visual stimulus spent more of the time in Darwin VII's gripper, VAP neuronal units with receptive fields near the gripper were initially selected for and their corresponding connections to neuronal units in IT were potentiated (see Fig. 4, First Horizontal Shaped Block). Once localized and patterned activity began in IT, it tended to sustain itself via local recurrent excitation combined with lateral inhibition. Continual overlapping input from VAP as Darwin VII moved toward and away from the stimulus block (Almassy et al., 1998) led to the reinforcement of the specific pattern of changes in synaptic strength from the retinotopically mapped VAP neuronal units to the non-topographic populations of neuronal units in IT. By these means, the activity of VAP neuronal units that drove IT neuronal units into a stimulus-dependent pattern of activity expanded from those with receptive fields near Darwin VII's gripper to those involving an almost complete coverage of the visual field (see Fig. 4, Fifth Horizontal Striped Block). As a result, IT neuronal units were primed to respond to stimuli over a wider range of Darwin VII's visual field. Invariant object recognition is thus a system property that emerges dynamically from competitive neuronal group interactions within and between neural areas. These interactions differ from those of other models in which images are typically static and invariant object recognition is achieved by arranging features to line up across multiple views (Mundy and Zisserman, 1992; Mundy et al., 1992; Shashua, 1993; Weinshall, 1993), by deriving a learning rule that utilizes the temporal trace of neural activity (Wallis and Rolls, 1997; Rolls and Stringer, 2001), or by placing the main responsibility for invariance on neuronal properties alone (Tovee et al., 1994; Rolls and Tovee, 1995).
One striking characteristic of Darwin VII observed under all circumstances was the individuality of the patterns displayed by each subject's neural responses even for repetitions of the same behavior (see Fig. 5). This is consistent with the observation that adaptive behaviors tend to remain similar despite changes in context and variance in system properties resulting from multiple interactions across circuitry, plastic synaptic connections, fluctuating value systems, and variable object encounters. Thus, Darwin VII is structurally and behaviorally degenerate: different circuits and dynamics can yield similar behavior (Tononi et al., 1999; Edelman and Gally, 2001). The developmental experiments comparing responses to strongly biased samples of appetitive or aversive stimuli indicate, however, that even with identical starting architectures, changes in experiential sequences can have profound effects. While this has been documented phenomenologically with living organisms, the experiments reported here may suggest possible mechanisms underlying such epigenetic biases.
The ability to modify various levels of control in Darwin VII provides insights into the neural mechanisms of learning during conditioning. For example, when CS1 was an auditory cue and CS2 was a visual cue, our second-order conditioning experiments revealed an asymmetry that was initially unexpected: a predominance of appetitive conditioned responses over aversive responses that is analogous to the psychological phenomenon of overshadowing. Overshadowing occurs when an intense, salient stimulus gains control of responses at the expense of another less salient stimulus (Pavlov, 1927; Staddon, 1983). In the second-order conditioning experiments where CS1 was an auditory cue and CS2 was a visual cue, behavior similar to that of overshadowing occurred in Darwin VII for two reasons. First, because of the simple tonotopic mapping in A1, responses to auditory stimuli were stronger and easier to categorize than responses to visual stimuli. No overshadowing occurred when CS1 was visual and CS2 was auditory, since visual categories in IT and the appropriate behavioral response developed during primary conditioning when visual stimuli (CS1) were paired with taste (US). Second, during the second stage of conditioning when both CS1 and CS2 were present, responses to the reinforcement (i.e. taste) of appetitive stimuli overshadowed aversive learning. This is attributable to the fact that after aversive learning the blocks were avoided before gripping and therefore taste reinforcement did not take place. Thus, in this sequence, Darwin VII generalized incorrectly that all visual stimuli were predictive of positive value. In the appetitive learning condition, this avoidance did not occur and reinforcement came from the US (taste) and CS1 (auditory cue). Conditioning performance more consistent with animal models was obtained by altering the synaptic efficacy so that changes in plasticity were on average larger for aversive events than for appetitive events (see Figs 7 and 8). These results are consistent with the observation that the brain uses different learning rates for punishment and reward and that, in some cases, this difference may be crucial for an organism's survival (Garcia et al., 1955; Siucinska et al., 1999; O'Doherty et al., 2001).
The design of brain-based devices such as Darwin VII that possess neuroanatomical structure and large-scale neural dynamics differs fundamentally from that of robots. Unlike Darwin VII, robotic approaches using classical artificial intelligence are based on data representation, rule-driven algorithms, and the manipulation of formal symbol systems (Moravec, 1983; Nilsson, 1984). Artificial intelligence has been somewhat successful in emulating logical aspects of human behavior, but has been less successful in dealing with perception, categorization and movement in the world, which is a strength of synthetic neural models and brain-based devices (Reeke and Edelman, 1988; Pfeifer and Scheier, 1997). Purely reactive or behavior-based robots carry out actions that are controlled through arbitration of several primitive behavioral repertoires without neural architectures (Brooks, 1986; Arkin, 1993). Evolutionary robotics, in which control systems are selected after each trial or lifetime according to a fitness function (Nolfi and Floreano, 2000), can evolve complex behaviors with very simple systems, but also does not emphasize neuronal responses. A recent hybrid between evolutionary algorithms and artificial neural network learning rules was designed to mutate learning rules between trials, allowing learning during the lifetime of the robot (Floreano and Mondada, 1998). Typically, however, the artificial neural networks controlling the evolutionary robot's behavior were small (on the order of tens of artificial neural units) and they did not reflect neuroanatomical organization.
In its present form, Darwin VII has several limitations. In comparison to organisms that its behavior mimics, it has an extremely simple nervous system. Re-entrant connections (Edelman, 1987) within a neural area are present in the model, but re-entrant connections between neural areas, such as A1 and IT, are currently not implemented. This limits intra-modal and cross-modal interactions, making its behavior purely stimulus driven. Moreover, the activity in a simulation cycle is the average of a relatively small population of neurons over 100–200 ms, and the spiking dynamics of individual neurons cannot presently be explored with this model. Despite these limitations, Darwin VII's performance shows that, regardless of the existence of individual variance, neurally based devices acting in the real world can carry out consistent behaviors.
One might ask why the simulation must include behaviors in the real world. Why not simulate the environment as well as the brain? The answer rests in the constructive nature of the brain and behavioral responses to real-world inputs (Chiel and Beer, 1997; Clark, 1997). For example, to specify the outlines of an environmental object in a pure computer simulation of the environment would contribute an a priori bias in the form of a detailed albeit implicit instruction. In contrast, by acting in the real world, Darwin VII decides for itself on object properties and then constructs appropriate responses. By using a real-world environment, not only is the risk of introducing biases into the model reduced, but also the experimenter is freed from the substantial burden of constructing a highly complex simulated environment (Edelman et al., 1992).
Although the world of Darwin VII is much simpler than a real econiche, there does not seem to be a fundamental restriction on constructing a more complex phenotype to deal with richer environments. Experiments exploring the effects of different neuroanatomical arrangements, the effects of lesions, and of altered synaptic responses are also now possible. As in the present experiments, the behaviors of the resulting brain-based devices would emerge solely as a result of internally generated activity of their nervous systems rather than of responses to any programmed instructions from computer software. Devices of this kind might prove useful in situations of novelty where computation is not possible or in cases of great local complexity where programming proves infeasible. In the near future, such devices are not likely to include behaviors as rich as those of higher vertebrates, and therefore their greatest practical use may at present be to complement computers in a hybrid arrangement, i.e. a brain-based device linked to a conventional digital computer. Since the fundamental operation of such devices includes random fluctuations and unpredictable behaviors, they are not in any strict sense Turing machines. Although the phrase 'machine psychology' may thus appear to be a misnomer, it may nevertheless be usefully applied to the behavior of non-living things that learn. In any case, providing such synthetic constructions with increasingly sophisticated neural circuits and body forms should give further valuable insights into the relationships between brain, body and behavior.
| Appendix: Specifics of Neuronal Responses, Input and Output in Darwin VII |
|---|
A. Computation
The simulated nervous system was run on a Quad Pentium III Xeon Linux workstation capable of communicating with the mobile base. The workstation received visual input via radio frequency (RF) video transmission from a CCD camera mounted on the mobile base (see Appendix, part E). The workstation received auditory and gripper information, and transmitted motor and actuator commands via an RF modem (see Appendix, part F).
B. Neuronal Unit Activity
The total contribution of input to unit i is given by
[equations missing]
where [...] determined the persistence of unit activity from one cycle to the next, [...]_i is a unit-specific firing threshold, and g_i is a scale factor. Specific parameter values for neuronal units are given in Table A1.
C. Activity-dependent Synaptic Plasticity
Activity-dependent synaptic changes in c_ij were given by
[equation missing]
where [...] is a fixed learning rate, [...] is a decay constant, and c_ij(0) is the initial (t = 0) weight of connection c_ij. The decay constant governed a passive, uniform decay of synaptic weights to their original starting values. The function F, similar to the BCM learning rule (Bienenstock et al., 1982), [...] ([...]_1 < [...]_2 < 1), two inclinations (k_1, k_2) and a saturation parameter [...] ([...] = 6 throughout).
[equation missing]
See Figure 8 inset for a representative chart of the function. Specific parameter settings for fine-scale synaptic connections are given in Table A2.
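The passive decay of synaptic weights toward their starting values can be illustrated with a short sketch. The update equation itself is not recoverable from this text, so the linear relaxation used here, along with the names `passive_decay`, `decay_trace` and all numerical values, is an assumption made purely for illustration. The sketch omits the activity-dependent term and the function F entirely.

```python
def passive_decay(weight, initial_weight, decay_constant):
    """One simulation cycle of decay toward the starting weight.

    ASSUMPTION: a linear relaxation, c <- c - eps * (c - c(0)).
    The source text does not preserve the actual update equation.
    """
    return weight - decay_constant * (weight - initial_weight)


def decay_trace(weight, initial_weight, decay_constant, cycles):
    """Return the weight after each of `cycles` decay steps (no learning)."""
    trace = [weight]
    for _ in range(cycles):
        weight = passive_decay(weight, initial_weight, decay_constant)
        trace.append(weight)
    return trace


# Hypothetical values: a potentiated and a depressed connection that both
# started at 0.2 drift back toward 0.2 once activity-dependent changes stop.
potentiated = decay_trace(weight=0.8, initial_weight=0.2, decay_constant=0.1, cycles=50)
depressed = decay_trace(weight=0.05, initial_weight=0.2, decay_constant=0.1, cycles=50)
```

Under this assumed form, the distance from the starting weight shrinks by a constant factor on every cycle, so a connection keeps a learned change only while correlated activity continues to renew it.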
D. Value-dependent Synaptic Plasticity
A value term was computed as
[equation missing]
where [...] is the average activity in area S, and v(d − 1) is the value of V at time d − 1. f is a convolution function that scales the activity over the delay period with values of 0.1, 0.1, 0.3, 0.7, 1.0, 1.0, 0.7, 0.3 and 0.1 for the nine delay increments. The effect of this convolution is to delay onset of the value system activity and spread the activity over time. The synaptic change for value-dependent synaptic plasticity was given by:
[equation missing]
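The delaying and spreading effect of the convolution function f can be seen by applying its nine weights to a brief burst of activity. The kernel values below are those quoted in the text. The function name `spread_over_delay`, the zero-based indexing of delay increments and the single-cycle input pulse are assumptions for illustration only. This is a sketch of the kernel's effect, not a reconstruction of the value equation.

```python
# Weights of the convolution function f for the nine delay increments,
# as listed in the text.
KERNEL = [0.1, 0.1, 0.3, 0.7, 1.0, 1.0, 0.7, 0.3, 0.1]


def spread_over_delay(activity, kernel=KERNEL):
    """Causal convolution: out[t] = sum over d of kernel[d] * activity[t - d].

    ASSUMPTION: delay increments are indexed from 0 and one increment
    corresponds to one simulation cycle.
    """
    out = [0.0] * (len(activity) + len(kernel) - 1)
    for t, a in enumerate(activity):
        for d, k in enumerate(kernel):
            out[t + d] += k * a
    return out


# A hypothetical one-cycle burst of activity followed by silence.
pulse = [1.0, 0.0, 0.0, 0.0, 0.0]
response = spread_over_delay(pulse)

# Index of the first cycle at which the scaled response is maximal.
peak_cycle = max(range(len(response)), key=response.__getitem__)
```

With this input, the response rises gradually, peaks several cycles after the burst and then tapers off, which matches the description of a delayed onset and activity spread over time.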
E. Visual System and its Input
The CCD camera sent 320 x 240 pixel monochrome video images, via an RF transmitter, to an ImageNation PXC200 frame grabber attached to the computer running the neural simulation. The image was c















