Cerebral Cortex Advance Access originally published online on November 29, 2006
Cerebral Cortex 2007 17(9):2172-2189; doi:10.1093/cercor/bhl128
Physiological and Anatomical Evidence for Multisensory Interactions in Auditory Cortex
1 Department of Physiology, Anatomy, and Genetics, University of Oxford, Oxford OX1 3PT, UK, 2 Department of Neurobiology and the Interdisciplinary Center for Neural Computation, Hebrew University, Jerusalem, Israel
Address correspondence to Dr J.K. Bizley, Department of Physiology, Anatomy, and Genetics, Parks Road, Oxford OX1 3PT, UK. Email: jennifer.bizley{at}physiol.ox.ac.uk.
Abstract
Recent studies, conducted almost exclusively in primates, have shown that several cortical areas usually associated with modality-specific sensory processing are subject to influences from other senses. Here we demonstrate using single-unit recordings and estimates of mutual information that visual stimuli can influence the activity of units in the auditory cortex of anesthetized ferrets. In many cases, these units were also acoustically responsive and frequently transmitted more information in their spike discharge patterns in response to paired visual–auditory stimulation than when either modality was presented by itself. For each stimulus, this information was conveyed by a combination of spike count and spike timing. Even in primary auditory areas (primary auditory cortex [A1] and anterior auditory field [AAF]), 15% of recorded units were found to have nonauditory input. This proportion increased in the higher level fields that lie ventral to A1/AAF and was highest in the anterior ventral field, where nearly 50% of the units were found to be responsive to visual stimuli only and a further quarter to both visual and auditory stimuli. Within each field, the pure-tone response properties of neurons sensitive to visual stimuli did not differ in any systematic way from those of visually unresponsive neurons. Neural tracer injections revealed direct inputs from visual cortex into auditory cortex, indicating a potential source of origin for the visual responses. Primary visual cortex projects sparsely to A1, whereas higher visual areas innervate auditory areas in a field-specific manner. These data indicate that multisensory convergence and integration are features common to all auditory cortical areas but are especially prevalent in higher areas.
Key Words: cross-modal processing, ferret, information theory, retrograde labeling, sensory convergence, visual
Introduction
Perception of real-world events frequently depends on the synthesis of information from different sensory systems. Revealing where in the brain sensory signals are combined and integrated is key to understanding the basis by which cross-modal processing influences behavior. Recently, studies in both human (Calvert et al. 1999
Evidence for multisensory integration in humans is based on imaging or electroencephalographic/magnetoencephalographic studies (Calvert et al. 1999; Giard and Peronnet 1999; Foxe et al. 2000, 2002; Molholm et al. 2002, 2004; Murray et al. 2005), which suffer from low spatial resolution and therefore cannot precisely localize regions of convergence. Moreover, intracranial recordings in monkeys are often based on local field potentials and/or multiunit activity (Schroeder et al. 2001; Schroeder and Foxe 2002; Fu et al. 2003; Ghazanfar et al. 2005). Although these methods allow activity to be localized to particular cortical fields and layers, it can be difficult to determine whether multisensory responses are elicited by individual neurons or by a combination of modality-specific neurons found in close proximity. Furthermore, when using measures of summed activity, it is not possible to correlate the presence of multisensory input with the response characteristics of individual neurons.
Visual influences on neurons in auditory cortex have been reported in awake primates at the single-neuron level, but it was proposed that these emerged as a result of the behavioral training received by the animals (Brosch et al. 2005). Thus, the extent to which individual neurons in low-level cortical areas traditionally viewed as modality specific are normally involved in multisensory processing remains uncertain. Moreover, all previous electrophysiological studies in this field have investigated multisensory convergence and integration by comparing the number of spikes evoked by different stimuli. Studies of visual (Van Rullen et al. 1998), auditory (Furukawa et al. 2000; Brugge et al. 2001; Nelken et al. 2005), and somatosensory (Panzeri et al. 2001; Johansson and Birznieks 2004) processing have now shown that stimulus information can also be encoded by temporal features of the spike discharge pattern. Thus, a full characterization of the sensitivity of neurons to multisensory stimuli requires analytical approaches to be used that take into account different neural coding schemes.
In this study, we show using single-unit recordings and estimates of mutual information (MI) between stimuli and spike trains that units sensitive to visual stimulation are widespread in both primary and nonprimary areas of ferret auditory cortex and that these neurons frequently transmit more information in response to bisensory stimulation than to either auditory or visual stimuli presented by themselves. We also show that ferret auditory cortex receives inputs from several visual cortical areas and from parietal cortex, which could provide the basis for these nonauditory response properties.
Materials and Methods
Animal Preparation
All animal procedures were approved by the local ethical review committee and performed under license from the UK Home Office in accordance with the Animals (Scientific Procedures) Act 1986. Nineteen adult pigmented ferrets (Mustela putorius) were used in this study. Eight of these were used exclusively for electrophysiological recordings, and 11 were used for neuroanatomical tract-tracing experiments. All animals received regular otoscopic examinations prior to the experiment to ensure that both ears were clean and disease free.
Anesthesia was induced by 2 mL/kg intramuscular injection of alphaxalone/alphadolone acetate (Saffan; Schering-Plough Animal Health, Welwyn Garden City, UK). The left radial vein was cannulated, and a continuous infusion (5 mL/h) of a mixture of medetomidine (Domitor; 0.022 mg/kg/h; Pfizer, Sandwich, UK) and ketamine (Ketaset; 5 mg/kg/h; Fort Dodge Animal Health, Southampton, UK) in physiological saline containing 5% glucose was provided throughout the experiment. The infusate was supplemented with 0.5 mg/kg/h dexamethasone (Dexadreson; Intervet UK Ltd, Milton Keynes, UK) and 0.06 mg/kg/h atropine sulfate (C-Vet Veterinary Products, Leyland, UK) in order to reduce the risk of cerebral edema and bronchial secretions, respectively. A tracheal cannula was implanted so that the animal could be placed on a ventilator and body temperature, end-tidal CO2, and the electrocardiogram were monitored throughout. The right pupil was dilated by topical application of atropine sulfate and protected with a zero-refractive power contact lens.
The animal was placed in a stereotaxic frame, and the temporal muscles on both sides were retracted to expose the dorsal and lateral parts of the skull. For terminal recording experiments, a metal bar was cemented and screwed into the right side of the skull, holding the head without further need of a stereotaxic frame. On the left side, the temporal muscle was largely removed to gain access to the auditory cortex, which is bounded by the suprasylvian sulcus (Fig. 1) (Kelly et al. 1986). For tract-tracing experiments, custom-made hollow ear bars were used to maintain stable head position without the need for a headbar. The suprasylvian and pseudosylvian sulci were exposed by a craniotomy. The overlying dura was removed and the cortex covered with silicone oil. The animal was then transferred to a small table in an anechoic chamber (IAC Ltd, Winchester, UK).
Stimuli
Acoustic stimuli were generated using TDT system 3 hardware (Tucker-Davis Technologies, Alachua, FL). In 3 recording experiments and all tract-tracing experiments, acoustic stimuli were presented via a closed-field electrostatic speaker (EC1, Tucker-Davis Technologies). In the remaining recording experiments, a Panasonic headphone driver (RPHV297, Panasonic, Bracknell, UK) was used. The electrostatic drivers had a flat frequency output to >30 kHz, whereas the output of the Panasonic drivers extended to 25 kHz. Closed-field calibrations were performed using a one-eighth inch condenser microphone (Brüel and Kjær, Naerum, Denmark), placed at the end of a model ferret ear canal, to create an inverse filter that ensured the driver produced a flat (less than ±5 dB) output. All acoustic stimuli were presented contralaterally.
Pure-tone stimuli were used to obtain frequency-response areas (FRAs), both to characterize individual units and to determine tonotopic gradients in order to identify in which cortical field any given recording was made. The tone frequencies used ranged, in one-third octave steps, from 500 Hz to 24 kHz (Panasonic driver) or 500 Hz to 30 kHz (TDT EC1 driver) and were 100 ms in duration (5 ms cosine ramped). Intensity levels were varied between 10 and 80 dB SPL in 10 dB increments. This totaled 150–200 frequency-level combinations, each of which was presented pseudorandomly 3 times at a rate of once per second. Broadband noise bursts (40 Hz–30 kHz bandwidth and cosine ramped with a 10 ms rise/fall time), generated afresh on every trial, were used as a search stimulus in order to establish whether each unit was acoustically responsive.
The visual stimulus was a diffuse light flash, which was varied in intensity from 0.3 to 70 cd/m2, calibrated with a Tektronix J16 photometer (Bracknell, UK), presented from a light-emitting diode that was usually fixed at a distance of 10 cm from the contralateral eye so that it illuminated virtually the whole contralateral visual field. In order to determine whether units were acoustically and/or visually responsive, 100-ms noise bursts and light flashes were presented separately or simultaneously at a rate of once per second. These stimuli were interspersed with a no-stimulus condition, with presentation pseudorandomized, and each stimulus configuration presented 20–40 times. To eliminate any possibility that responses recorded following presentation of visual stimuli were artifacts of our experimental design, we confirmed that no sound was emitted when the visual stimuli were switched on and off using a Brüel and Kjær one-eighth inch microphone and type 2610 measuring amplifier and that these responses disappeared when the LED was active but covered up. In 2 animals, visual receptive fields were mapped using a flashing LED mounted on a robotic arm (TDT), which allowed the stimuli to be presented at different angles at a distance of 1 m from the animal's eye.
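The interleaved presentation of the four stimulus configurations (noise alone, light alone, both together, and no stimulus) can be illustrated with a short Python sketch. This is only an illustration under stated assumptions, not the stimulus-generation software used in the study: the function name `make_trial_schedule`, the condition labels, and the fixed random seed are hypothetical.

```python
import random

# Hypothetical labels for the four stimulus configurations.
CONDITIONS = ("none", "auditory", "visual", "bisensory")


def make_trial_schedule(n_repeats=20, seed=0):
    """Return a pseudorandom sequence with n_repeats trials per condition."""
    rng = random.Random(seed)  # fixed seed only so the sketch is reproducible
    trials = [cond for cond in CONDITIONS for _ in range(n_repeats)]
    rng.shuffle(trials)
    return trials


schedule = make_trial_schedule(n_repeats=20)
print(len(schedule), schedule[:5])
```

With one trial per second, 20 to 40 repeats of each configuration would correspond to 80 to 160 s of recording per unit.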
Data Acquisition
Recordings were made with silicon probe electrodes (Neuronexus Technologies, Ann Arbor, MI). In 5 animals, we used electrodes with a 4 × 4 configuration (4 active sites on 4 parallel probes, with a horizontal and a vertical spacing of 200 µm). In a small number of recordings in one of these animals, and for all recordings in a further 3 animals, a single shank electrode was used with 16 active sites spaced at 150-µm intervals. The electrodes were positioned so that they entered the cortex approximately orthogonal to the surface of the ectosylvian gyrus (EG). Recordings were made in all auditory cortical fields that were identified in the ferret by Bizley et al. (2005), although, because the anterior ectosylvian sulcus (AES) is known to be a multisensory area (Ramsay and Meredith 2004), there was a sampling bias toward the rostral fields on both the middle ectosylvian gyrus (MEG) and anterior ectosylvian gyrus (AEG).
The neuronal recordings were band-pass filtered (500 Hz–5 kHz), amplified (up to 20 000x), and digitized at 25 kHz. Data acquisition and stimulus generation were performed using BrainWare (Tucker-Davis Technologies).
Data Analysis
Spike sorting was performed off-line. The noise level in the signal was averaged over the preceding second and the trigger level for detecting a spike automatically adjusted to be 3 times this level. Single units were isolated from the digitized signal by manually clustering data according to spike features such as amplitude, width, and area. We also inspected autocorrelation histograms, and only cases in which the interspike interval histograms revealed a clear refractory period were classed as single units.
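The adaptive trigger level described above can be sketched in Python as follows. This is a simplified illustration rather than the detection routine actually used: the text does not specify how the noise level was estimated, so the sketch assumes it is the mean absolute amplitude over the preceding second, and the function name `detect_spikes` and its parameters are hypothetical. The default sampling rate of 25 kHz matches the digitization rate given in Data Acquisition.

```python
def detect_spikes(signal, fs=25000, window_s=1.0, factor=3.0):
    """Return sample indices of upward crossings of an adaptive threshold.

    Assumption: the noise level is the mean absolute amplitude of the
    preceding window; the threshold is `factor` times that level.
    """
    win = int(fs * window_s)
    crossings = []
    running = 0.0   # sum of |x| over the preceding `win` samples
    above = False   # whether the previous sample was above threshold
    for i, x in enumerate(signal):
        if i >= win:
            threshold = factor * running / win
            is_above = x > threshold
            if is_above and not above:
                crossings.append(i)  # first sample of a new crossing
            above = is_above
            running -= abs(signal[i - win])  # drop the oldest sample
        running += abs(x)
    return crossings


# Toy signal sampled at 100 Hz: unit-amplitude noise with two large deflections.
signal = [1.0 if i % 2 == 0 else -1.0 for i in range(300)]
signal[150] = 10.0
signal[250] = 10.0
print(detect_spikes(signal, fs=100))
```

No events are reported during the first second, because a full window of preceding samples is needed before a noise level can be estimated.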
Data analysis was performed in MATLAB (MathWorks Inc., Natick, MA). Visual latencies were typically, although not always, longer than auditory latencies, and responses to both stimulus modalities lasted for up to 200 ms after stimulus onset. In order to classify whether a unit was responsive to auditory and/or visual stimuli, 2 methods of analysis were used. First, a 2-way analysis of variance (ANOVA) of spike counts over a 200-ms window was performed, in which the 2 binary factors were the presence/absence of an auditory stimulus and the presence/absence of a visual stimulus. This allowed us to quantify whether individual units responded to each form of stimulation and whether there was any interaction resulting from the combined presentation of light and sound. Second, measures of MI were calculated using methods described by Nelken et al. (2005). Briefly, the stimulus S and neural response R were treated as random variables, and the MI between them, I(S;R), measured in bits, was calculated as a function of their joint probability p(s, r) and defined as
$$I(S;R) = \sum_{s,r} p(s,r)\,\log_2 \frac{p(s,r)}{p(s)\,p(r)} \tag{1}$$
1 spike. The naive MI (including the bias) was computed initially using a matrix based on a large number of bins, each with a low probability. The matrix was then reduced, step by step, by joining the rows and columns to create coarser binning and the MI and bias recomputed. The reduction continued until only a single row or column remained, resulting in a set of decreasing MI values and a corresponding set of decreasing bias values. The MI was estimated by the largest difference between the 2. To estimate the MI for the present data set, responses were classified according to the presence or absence of an auditory or visual stimulus (i.e., for a total of 4 stimulus conditions). The spike train was then binned at several time resolutions, ranging from 8 to 256 ms. Because binning is a data reduction step, the maximal MI over all temporal resolutions was considered as the best estimate of the true MI. To assess whether the obtained value was significant, stimuli and responses were randomized and the MI recalculated. The data were bootstrapped in this manner 100 times, and the 99th percentile was extracted from the resulting distribution. If the MI calculated from the data exceeded this value, it was considered to be significant.
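The core of this procedure, a mutual information estimate compared against a shuffled null distribution, can be sketched in Python. The sketch is deliberately simplified and is not the estimator described above: it uses the naive (plug-in) MI on discrete response labels and omits the adaptive matrix reduction and bias correction. The function names, the nearest-rank percentile rule, and the toy data are assumptions made for illustration.

```python
import math
import random
from collections import Counter


def mutual_information(stimuli, responses):
    """Plug-in estimate of I(S;R) in bits from paired discrete labels."""
    n = len(stimuli)
    joint = Counter(zip(stimuli, responses))
    n_s = Counter(stimuli)
    n_r = Counter(responses)
    mi = 0.0
    for (s, r), count in joint.items():
        # p(s,r) / (p(s) p(r)) simplifies to count * n / (n_s * n_r)
        mi += (count / n) * math.log2(count * n / (n_s[s] * n_r[r]))
    return mi


def shuffle_threshold(stimuli, responses, n_shuffles=100, percentile=99, seed=0):
    """MI value at the given percentile of a shuffled (null) distribution."""
    rng = random.Random(seed)
    labels = list(stimuli)
    null = []
    for _ in range(n_shuffles):
        rng.shuffle(labels)  # break the stimulus-response pairing
        null.append(mutual_information(labels, responses))
    null.sort()
    rank = max(1, math.ceil(percentile / 100 * n_shuffles))  # nearest rank
    return null[rank - 1]


# Toy data: 4 stimulus conditions, 20 trials each, spike count fixed per condition.
stimuli = ["A"] * 20 + ["V"] * 20 + ["AV"] * 20 + ["none"] * 20
counts = [3] * 20 + [1] * 20 + [4] * 20 + [0] * 20
mi = mutual_information(stimuli, counts)
print(mi, mi > shuffle_threshold(stimuli, counts))
```

In this toy case the response identifies the stimulus condition perfectly, so the estimate reaches the maximum possible for 4 equiprobable conditions, 2 bits. Real spike trains would first have to be binned and reduced to discrete labels.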
To test for an interaction when the 2 stimuli were presented together, the nonstimulus trials were removed from consideration. Then the MI between the responses and the binary classification of unisensory/bisensory stimulation was calculated. This MI is significant when the distribution of responses to bisensory stimulation is different from the distribution of responses to visual and auditory stimuli presented separately. Because this would also be the case for modality-specific neurons, the MI was reestimated after removing possible unisensory auditory effects by randomly intermixing the responses to the light alone with those to the light and sound presented together. The MI was then reestimated after removing the possible contribution of unisensory visual effects by randomizing the sound with the light–sound trials. In both cases, bootstrap was used to build a distribution of MI values. If the MI for the real data exceeded both estimated confidence limits, it was concluded that a significant multisensory interaction was present in the neuronal response. Therefore, units classified as "bisensory" either had a spiking response to both modalities of stimulation when presented independently or had a significant response to one modality of stimulation, which was modulated by the presence of the second stimulus modality that did not, alone, produce a significant response.
In general, we found that the 2-way ANOVA based on the spike counts and the MI analysis produced results in good accordance with each other (>70% of units were classified in the same manner by each type of test). However, examination of raster plots and derived significance values suggested that the MI analysis was slightly more sensitive, presumably because it takes into account spike timing patterns as well as spike counts (see below). Furthermore, the ANOVA assumes linearity in the contributions of each factor and classified a small population of units that clearly showed nonlinear interactions between the visual and auditory stimuli as unisensory. By contrast, the MI analysis captured such interactions. Consequently, the MI values were used to classify the responses of the cortical units.
In order to quantify the contribution of spike timing information to the MI estimates, a simplified timing statistic, the mean response latency, was computed (see Nelken et al. 2005). The mean response latency is simply the mean latency of all spikes within the response window and would be equal to the first spike latency if there was only one spike. MI was calculated for each stimulus condition using either the spike count (the total number of spikes over a 200-ms response window) or the mean response latency. It has been shown that, although neither of these statistics alone can capture the full information in spike trains recorded from auditory cortex, jointly they can convey all the available information about the stimuli that is present in the neural response (Nelken et al. 2005).
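The two reduced statistics can be computed per trial as in the following Python sketch. The function name is hypothetical, spike times are assumed to be in milliseconds relative to stimulus onset, and returning `None` for the latency of an empty trial is a choice made here for illustration; the text does not say how trials without spikes were treated.

```python
def spike_count_and_mean_latency(spike_times_ms, window_ms=200.0):
    """Return (spike count, mean spike latency) within the response window.

    Spike times are relative to stimulus onset. The mean latency is None
    when the window contains no spikes (an assumption of this sketch).
    """
    in_window = [t for t in spike_times_ms if 0.0 <= t < window_ms]
    count = len(in_window)
    mean_latency = sum(in_window) / count if count else None
    return count, mean_latency


print(spike_count_and_mean_latency([12.0, 30.0, 250.0]))
print(spike_count_and_mean_latency([15.0]))
```

With a single spike in the window, the mean response latency equals the first spike latency, as noted above.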
The type and magnitude of the multisensory interaction were quantified as in previous studies in this field (Newman and Hartline 1981; King and Palmer 1985; Populin and Yin 2002) using the following formula:
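$$\text{Response modulation (\%)} = \frac{\text{bisensory response} - (\text{auditory response} + \text{visual response})}{\text{auditory response} + \text{visual response}} \times 100 \tag{2}$$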
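A short Python sketch of an index of this kind, which compares the bisensory response with the linear sum of the two unisensory responses. The function name, the use of mean spike rates as inputs, the expression of the result as a percentage of the linear sum, and the handling of a zero sum are assumptions made for illustration.

```python
def response_modulation(r_auditory, r_visual, r_bisensory):
    """Percent deviation of the bisensory response from the unisensory sum.

    Positive values indicate a superadditive interaction (facilitation),
    negative values a sublinear interaction (occlusion), and zero a
    purely additive response.
    """
    linear_sum = r_auditory + r_visual
    if linear_sum == 0:
        raise ValueError("index undefined when both unisensory responses are zero")
    return 100.0 * (r_bisensory - linear_sum) / linear_sum


print(response_modulation(10.0, 5.0, 15.0))   # additive
print(response_modulation(10.0, 5.0, 30.0))   # facilitation
print(response_modulation(10.0, 10.0, 15.0))  # occlusion
```

Because the denominator is the summed unisensory response, the index grows very large when both unisensory responses are weak.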
Response latencies were calculated from the pooled poststimulus time histogram (PSTH) containing the responses to the appropriate stimulus. Minimum response latencies were computed as the time at which the pooled response first crossed a critical value defined as 20% of the difference between the spontaneous and peak firing rates (as in Bizley et al. 2005).
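The latency criterion can be sketched in Python as follows. The sketch assumes that the critical value is measured above the spontaneous rate, that is, spontaneous rate plus 20% of the difference between peak and spontaneous rates; the text leaves this detail open. The function name, the bin width, and the example histogram are hypothetical.

```python
def minimum_latency(psth, bin_ms, spontaneous_rate, fraction=0.2):
    """Return the start time (ms) of the first PSTH bin exceeding the criterion.

    Criterion (assumed): spontaneous + fraction * (peak - spontaneous).
    Returns None if the PSTH is empty or never exceeds the criterion.
    """
    if not psth:
        return None
    peak = max(psth)
    criterion = spontaneous_rate + fraction * (peak - spontaneous_rate)
    for i, rate in enumerate(psth):
        if rate > criterion:
            return i * bin_ms
    return None


# Example pooled PSTH in spikes/s with 5-ms bins and a spontaneous rate of 2 spikes/s.
psth = [2, 2, 3, 10, 22, 12, 4]
print(minimum_latency(psth, bin_ms=5, spontaneous_rate=2))
```

The temporal resolution of the estimate is limited by the bin width of the pooled histogram.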
Verification of Recording Sites
Following the completion of recordings, animals were perfused (see below). The cortex was removed, gently flattened between glass coverslips, and cryoprotected in 30% sucrose, after which 50-µm tangential sections were cut and stained for Nissl substance. This enabled the location of the recording sites to be examined, thereby ensuring that none of the electrode penetrations had passed through the suprasylvian sulcus and into the adjacent suprasylvian gyrus. Because it was not possible to make lesions at the recording sites, we could not identify which cortical layers the recorded units were located in. However, the tracks made by the silicon probe electrodes were clearly visible in the Nissl-stained sections, and depth measurements were derived from the microdrive readings from the point at which the electrodes entered the cortex.
Tracers
Aseptic surgical techniques were used in all tracer injection experiments. Tracers used were 10% dextran tetramethylrhodamine (10 000 MW, Fluororuby [FR]; Molecular Probes Inc., Eugene, OR), 10% dextran biotin fixable (biotinylated dextran amine [BDA], 10 000 and 3000 MW; Molecular Probes), and 1% cholera toxin subunit β (CTβ, List Biological Laboratories, Campbell, CA).
Tracer injections were, in most cases, made in physiologically identified cortical regions (see Table 1). When physiological verification was not possible, the locations of the tracer injections were assigned to a particular cortical field based on our previous descriptions of ferret auditory cortex (Bizley et al. 2005). A glass micropipette was lowered, and BDA, FR, or CTβ was injected, in most cases, by iontophoresis using a positive current of 5 µA and a duty cycle of 7 s for a duration of 15 min. In a small number of cases, FR and CTβ were injected by pressure with a nanoejector (Nanoject II; Drummond Scientific Company, Broomall, PA). Once the injections were complete, the micropipette was withdrawn, the dura lifted back in place, and the piece of cranium that had previously been removed replaced. Sutures were placed in the remaining temporal muscle and skin, so that they could be returned to their preoperative positions. The animals received intraoperative and subsequent postoperative analgesia with Vetergesic (0.15 mL of buprenorphine hydrochloride, intramuscularly; Alstoe Animal Health, Melton Mowbray, UK).
Tissue Processing
Survival times were between 2 and 4 weeks, after which transcardial perfusion followed terminal overdose with Euthatal (400 mg/kg of pentobarbital sodium; Merial Animal Health Ltd, Harlow, UK). The blood vessels were washed with 300 mL of 0.9% saline followed by 1 L of fresh 4% paraformaldehyde in 0.1 M phosphate buffer, pH 7.4 (PB). The brain was dissected from the skull, maintained in the same fixative for several hours, and then immersed in 30% sucrose solution in 0.1 M PB for 3 days. In 5 cases, the 2 hemispheres were dissected and gently flattened between 2 glass slides. In those cases, the cortex was later sectioned in the tangential plane; 6 other brains were sectioned in the standard coronal plane. Sections (50-µm thick) were cut on a freezing microtome, and 6 or 7 sets of serial sections were collected in 0.1 M PB. Every third series of sections was used to analyze the tracer labeling.
FR and CTβ were visualized with immunohistochemistry reactions, whereas BDA was reacted only with avidin biotin peroxidase (Vectastain Elite ABC Kit; Vector Laboratories, Burlingame, CA). Sections were washed several times in 10 mM phosphate-buffered saline (PBS) with 0.1% Triton X100 (PBS-Tx) and incubated overnight at 4 °C in the primary antibody (FR: anti-tetramethylrhodamine, rabbit immunoglobulin G [IgG]; Molecular Probes; dilution 1:6000; CTβ: goat-anti-CTβ, dilution 1:15 000). After washing 3 times in PBS-Tx, sections were incubated for 2 h in the biotinylated secondary antibody (biotinylated goat anti-rabbit IgG H + L [FR] or rabbit-anti-goat [CTβ], dilution 1:200; Vector Laboratories) at room temperature. Sections were once again washed and incubated for 90 min in avidin biotin peroxidase, washed in PBS, and then incubated with the chromogen solution; 3,3'-diaminobenzidine (DAB; Sigma-Aldrich Company Ltd, Dorset, UK). Sections were incubated in 0.4 mM DAB and 9.14 mM H2O2 in 0.1 M PB until the reaction product was visualized. When BDA and FR or CTβ were injected in the same animal, the BDA was first visualized with ABC followed by DAB enhanced with 2.53 mM nickel ammonium sulfate. The second tracer (FR or CTβ) was subsequently visualized using the appropriate protocol with DAB only as the chromogen. Reactions were stopped by rinsing the sections several times in 0.1 M PB. Sections were mounted on gelatinized glass slides, air dried, dehydrated, and coverslipped.
For every animal, one set of serial sections was counterstained with 0.2% cresyl violet, another set was selected to visualize cytochrome oxidase (CO) activity, and a third set was used to perform SMI32 immunohistochemistry to aid identification of different cortical areas and laminae. CO staining was obtained after 12 h incubation with 4% sucrose, 0.025% cytochrome C (Sigma-Aldrich), and 0.05% DAB in 0.1 M PB at 37 °C. To stain neurofilament H in neurons, we used a monoclonal mouse anti-SMI32 (dilution 1:4000; Sternberg Monoclonals, Inc., Latherville, MA). After immersion for 60 min in a blocking serum solution with 5% normal horse serum, the sections were incubated overnight at 5 °C with the mouse antibody and 2% normal horse serum in 10 mM PBS. Mouse biotinylated secondary antibody was used after brief washings in 10 mM PBS (mouse ABC kit, dilution 1:200 in PBS with 2% normal horse serum; Vector Laboratories). Immunoreaction was followed by several washings in PBS, incubation in ABC, and visualization using DAB with nickel–cobalt intensification (Adams 1981).
Histological Analysis
Sections were analyzed with a Leica DMR microscope fitted with a digital Leica camera using TWAIN software (Leica Microsystems, Heerbrugg, Switzerland). The locations of the labeled cells were plotted using a camera lucida onto drawings of the cortex produced from adjacent Nissl-stained sections.
Results
Ferret auditory cortex lies on the EG, where 6 different fields have been defined physiologically (Bizley et al. 2005
Data are presented here from recordings made from 756 single units in the left EG whose responses were significantly modulated by acoustic, visual, and/or multisensory stimulation (Bajo et al. 2006). Many other acoustically responsive units were recorded in these animals, but these were not tested with visual stimuli and are not considered further here. Data from each animal were initially examined separately, and, after ensuring that a consistent trend in the distribution of these responses was present in all animals, the data were pooled across subjects. Recording sites were assigned to different cortical fields based on their locations on the surface of the EG plus subsequent histology, as well as the frequency tuning and other response properties of the units (Bizley et al. 2005).
The raster plots in Figure 2 illustrate the range of responses of these units to the standard stimuli used to investigate visual–auditory interactions. These stimuli comprised 100 ms noise bursts presented to the contralateral ear, 100 ms light flashes presented within the contralateral visual field, or both presented simultaneously. The symbols at the top right of each panel indicate whether significant responses or visual–auditory interactions were observed (according to the MI values, see Materials and Methods). In Figure 2A, the unit responded robustly to acoustic stimulation, whereas the visual stimulus was ineffective in changing the spike discharge pattern of the unit, either by itself or in combination with the sound. By contrast, Figure 2B,C shows units whose responses were clearly modulated by both auditory and visual stimulation. In each case, bisensory stimulation evoked spike discharges in which components of the responses to each stimulus modality could be discerned by virtue of their different temporal firing patterns. Figure 2D shows a unit that responded to visual but not to auditory stimulation. In Figure 2E, the unit did not respond to visual stimulation alone, although its auditory response was enhanced when light flashes were presented simultaneously. Finally, Figure 2F shows an example of a unit in which the only significant response was obtained with combined visual–auditory stimulation.
Cortical Location of Modality-Specific and Multisensory Units
All units were classified as auditory, visual, or bisensory. Visual–auditory units included those exhibiting a clear spiking response to both stimulus modalities (e.g., Fig. 2B,C) and those for which the MI analysis revealed a significant interaction, that is, where responses to bisensory stimulation were significantly different from those evoked in either unisensory condition (e.g., Fig. 2E,F). Recording sites were plotted onto an image of the exposed cortex to form maps showing the location of each response type. Maps for 2 animals are shown in Figure 3A,B. In these plots, the blue dots indicate the location of units that were classified as unisensory auditory, green triangles the location of the unisensory visual units, and red diamonds the location of units displaying bisensory responses. Different cortical fields on the EG have been delimited (dashed lines) on the basis of the frequency-tuning properties of all the acoustically responsive units recorded in these animals (see Fig. 1 and Bizley et al. 2005). Often different symbols overlap due to the multiple recording sites on a single probe (this is especially true for Fig. 3B because in this animal recordings were made with a 16-channel single-shaft probe).
The most striking feature in Figure 3 is the incidence of bisensory units in each of the identified auditory cortical fields. Unisensory visual units were also widespread and were even found near the edges of the primary auditory fields, A1 and AAF, at the tip of the EG. Some of these units extended into the ventral bank of the suprasylvian sulcus, where AAF is located (see unfolded regions near the tip of the gyrus in Fig. 3A), but were not located on the suprasylvian gyrus. Figure 3C shows the proportions of each response type found in each cortical field. It should be noted that there was a sampling bias (for reasons discussed in the Materials and Methods) toward the anterior fields (AAF, ADF, AVF), and therefore, the total number of units recorded in these regions is higher than in the fields located on the posterior side of the EG. Even in primary auditory areas (A1 and AAF), 15% of recorded units were found to have nonauditory input. This proportion increased in the higher level fields that lie ventral to A1/AAF and was highest in area AVF, where nearly 50% of the units were found to be responsive to visual stimuli only and a further quarter to both visual and auditory stimuli.
Response Latencies
First spike response latencies were calculated for the visual response in all units whose responses were significantly modulated by light alone (n = 148), and auditory first spike latencies were calculated for all bisensory units exhibiting a significant unisensory response following contralateral stimulation with broadband noise bursts (n = 113). These are plotted, for each cortical area, in Figure 4A,B. Units ranged in their visual first spike latency from 40 to >200 ms, whereas most auditory latencies were <50 ms. Visual latencies in AAF were significantly shorter than those in most other cortical fields, whereas the longest latencies were found in field AVF. Although there were no significant interareal differences in auditory response latencies for this population of cells, the distribution of first spike latencies across different cortical areas followed a similar pattern to that previously described with pure-tone stimuli (Bizley et al. 2005), with the posterior fields (PPF and PSF) having the longest latencies.
Spatial Receptive Fields
In 2 of the animals, an LED mounted on a robotic arm at a distance of 1 m from the animal's head was used to map the visual spatial receptive fields. This visual stimulus was, in some cases, accompanied by simultaneous presentation of a contralateral noise burst. Figure 5A shows how the visual response of a unit recorded in AAF varied with the azimuthal angle of the LED; its response was clearly restricted to a region of contralateral space. The visual azimuth response profile of another unit, this time from ADF, is depicted in Figure 5B. The magnitude of the visual response is shown in the presence and absence of auditory stimulation. Again, the visual response was clearly tuned to the anterior contralateral quadrant and a significant interaction was found between the location of the light and the presence of the sound (P < 0.001), indicating that the addition of the auditory stimulus sharpened the spatial selectivity of the visual response. A total of 39 bisensory cells had their visual spatial receptive fields mapped in this fashion. In 31 of these units, there was a significant effect on the response of varying the location of the LED (2-way ANOVA with light position and presence of sound as factors). Azimuth profiles showing how the responses of all these units varied as the location of the robotic-arm–mounted LED was changed while presenting noise to the contralateral ear are plotted in Figure 5C. Although some units showed regions of increased and decreased activity at different LED locations, the majority were contralaterally tuned, as can be seen in the average spatial response profile.
Varying Stimulus Intensity
Because we wished to sample as many units as possible with our multielectrode arrays, we typically used a fairly intense visual stimulus in order to determine the incidence of visually sensitive units in different auditory cortical fields. In a number of cases, however, we also used much lower stimulus levels and found that these were also effective in generating responses. Auditory and visual thresholds are plotted for 31 bisensory units in Figure 6A. Many of these had very low thresholds. The thresholds for a further 21 visual units recorded from a range of auditory cortical fields are also plotted. An example of a bisensory unit with a low threshold for each stimulus modality is shown in Figure 6B. This unit, which had clear responses to both auditory and visual stimulation presented separately, had a visual response threshold of 0.5 cd/m² and an auditory threshold of 34 dB SPL.
Multisensory Interactions in Auditory Cortex
The magnitude of the visual–auditory interactions exhibited by units that were classified as bisensory was quantified using equation (2). This measures how different the multisensory response is from the linear sum of the responses to unisensory stimulation. Therefore, a bisensory unit that sums its inputs linearly (i.e., an additive interaction) or a unisensory unit whose response is unmodulated by the other stimulus modality will be deemed to show no cross-modal facilitation or occlusion according to this equation.
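As a rough illustration of an index of this kind, the sketch below assumes that the bisensory response is expressed as a percentage deviation from the sum of the 2 unisensory responses. The exact form of equation (2) is defined in the Methods and may differ; the function name `response_modulation` and the spike rates are hypothetical.

```python
def response_modulation(auditory, visual, bisensory):
    """Percentage deviation of the bisensory response from the linear sum.

    Assumed form (an illustration, not necessarily the paper's equation 2):
        100 * (AV - (A + V)) / (A + V)
    Inputs are evoked response magnitudes, e.g. mean driven spike rates.
    """
    linear_sum = auditory + visual
    if linear_sum == 0:
        raise ValueError("sum of unisensory responses is zero; index undefined")
    return 100.0 * (bisensory - linear_sum) / linear_sum


# Hypothetical spike rates (spikes/s), for illustration only.
additive = response_modulation(10.0, 5.0, 15.0)      # linear summation
sublinear = response_modulation(10.0, 5.0, 12.0)     # cross-modal occlusion
superadditive = response_modulation(2.0, 1.0, 6.0)   # cross-modal facilitation
unmodulated = response_modulation(10.0, 0.0, 10.0)   # unisensory unit, no effect
```

Under this assumed form, both the additive unit and the unmodulated unisensory unit yield a value of zero, negative values indicate occlusion, and positive values indicate facilitation.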
The responses of the 6 units that are shown in Figure 2 are replotted (as mean ± standard error of mean spike rates for each stimulus condition) in Figure 7, together with their response modulation values from equation (2). Units B, C, E, and F were classified as bisensory, either because they gave significant responses to both visual and auditory stimuli or because the responses to bisensory stimulation conveyed significantly more information than either of the unisensory responses. Units B and C exhibited sublinear interactions or cross-modal occlusion. By contrast, a significant facilitation or superadditive interaction was observed in the responses of units E and F; in the most extreme case (unit F), neither visual nor auditory stimuli were effective at driving the neuron, whereas bisensory stimulation produced a clear response. Such extreme cases of facilitation were relatively uncommon and were usually observed when the responses to unisensory stimulation were particularly weak.
Figure 8A shows the distribution of cross-modal response modulation values for units classified as bisensory. Most units exhibited linear or sublinear interactions in their spike discharge rates, and in only a few cases were superadditive effects of the sort shown in Figure 7E,F observed. When looking for bisensory interactions, the intensity of both stimuli was kept constant at a relatively high value. The incidence of cross-modal facilitation in the superior colliculus is known to decrease as stimulus intensity is increased (Meredith and Stein 1986
In order to compare the cross-modal response modulation values obtained with spike counts to the MI estimates, we used the same formula (eq. 2) but substituted the spike counts by the corresponding MI values (in bits). The distribution of response modulation values (Fig. 8B) was very similar, and both measures tended to reveal the same type of cross-modal interaction for individual units.
By calculating the MI associated with each stimulus condition (compared with spontaneous activity), we were able to show that in the majority of units classified as "bisensory" there was an increase in the transmitted information when combined visual–auditory stimuli were presented compared with the most effective unisensory stimulus. Figure 9 plots the MI value obtained in the bisensory condition against that for the most effective unisensory condition. The majority of points fell above the line of unity, indicating that there was more information in the response when visual and auditory stimulation was combined.
Information in Spike Timing
As previously mentioned, we found that the MI analysis provided a more sensitive index for the presence of multisensory responses than ANOVA tests based on spike count. We explicitly tested the hypothesis that this is the case because the MI analysis takes into account stimulus-related variations in spike timing as well as the overall spike count. It has previously been shown for auditory cortex that spike count and mean response latency can together capture the total information in the full spike discharge pattern (Nelken et al. 2005). The mean response latency is a reduced timing measure: it is the average latency of all spikes in the response window and therefore equals the first spike latency when there is only one spike. We therefore calculated the MI based on each of these 2 measures for all our recorded responses and compared the relative contributions of spike timing and spike count.
This analysis is presented in Figure 10A, which plots, for all units in which there was a significant response for that stimulus, the relative information in spike count and spike timing. Points lying above the line of unity indicate that the unit transmits more information in its mean response latency than in its spike count. The MI calculated from mean response latencies exceeded the MI from spike counts in 56% of auditory responses (crosses), 70% of visual responses (diamonds), and 52% of bisensory (triangles) responses. The most dramatic differences were obtained when the spike count information was relatively low due to there being little difference between the stimulus-evoked response and the spontaneous activity. Although the recorded spike counts suggested that the stimulus was relatively ineffective in these cases, there was often clear time-locked activity in response to the stimulus, which the mean response latency measure was able to extract. The response of one such unit is shown in Figure 10B.
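The comparison between spike count and mean response latency can be illustrated with a naive plug-in MI estimate. This is a sketch under stated assumptions, not the estimator used in the study: it applies no bias correction (so it overestimates MI when trials are few), and the 10-ms latency bin width, the function names, and the spike times are all hypothetical.

```python
from collections import Counter
from math import log2


def mean_response_latency(spike_times_ms):
    """Average latency of all spikes in the response window (None if no spikes)."""
    if not spike_times_ms:
        return None
    return sum(spike_times_ms) / len(spike_times_ms)


def latency_bin(spike_times_ms, bin_width_ms=10.0):
    """Discretize the mean response latency; -1 marks trials without spikes."""
    latency = mean_response_latency(spike_times_ms)
    return -1 if latency is None else int(latency // bin_width_ms)


def plugin_mi(stimuli, responses):
    """Plug-in mutual information (bits) between two discrete sequences."""
    n = len(stimuli)
    joint = Counter(zip(stimuli, responses))
    count_s = Counter(stimuli)
    count_r = Counter(responses)
    return sum(
        (c / n) * log2(c * n / (count_s[s] * count_r[r]))
        for (s, r), c in joint.items()
    )


# Hypothetical trials: 2 spikes on every trial, but time-locked only when
# the stimulus is present.
trials = [
    ("stimulus", [20, 24]), ("stimulus", [21, 23]),
    ("stimulus", [19, 25]), ("stimulus", [22, 22]),
    ("spontaneous", [5, 90]), ("spontaneous", [40, 70]),
    ("spontaneous", [10, 30]), ("spontaneous", [60, 95]),
]
labels = [label for label, _ in trials]
mi_count = plugin_mi(labels, [len(spikes) for _, spikes in trials])
mi_latency = plugin_mi(labels, [latency_bin(spikes) for _, spikes in trials])
```

Because every trial contains the same number of spikes, the spike count carries no information about the stimulus condition here, whereas the binned mean response latency does. This mirrors the situation described above, in which time-locked activity is present despite an evoked spike count close to the spontaneous level.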
Altering Stimulus Onset Times
A closer examination of the nature of the multisensory interactions was performed by altering the relative timing of the auditory and visual stimuli. Of 241 units in which there was no visual response and no multisensory interaction was detectable following simultaneous presentation of the 2 stimuli, 6% (14 units) showed suppression of the auditory response when the visual stimulus was presented either 100 or 200 ms prior to the acoustic stimulus. An example of such a neuron is shown in Figure 11. This unit responded robustly to a noise burst but was unaffected by a light stimulus, presented either alone or simultaneously with the sound. However, presentation of the visual stimulus 100 or 200 ms prior to the auditory stimulus caused a significant suppression of the response to noise. Such data show that some multisensory interactions can be observed only in response to very specific stimulus configurations.
Distribution in Cortical Depth
Information about the origin of visual inputs to auditory cortex can be obtained by examining the laminae in which multisensory interactions occur. Because the silicon probe electrodes used in these experiments do not allow an accurate histological reconstruction of the depths of the recording sites, we are unable to match recordings to specific cortical laminae. However, as the probes had multiple recording sites and recordings were made orthogonal to the surface of the cortex, it was possible to examine the relative depth of the multisensory responses.
We examined the distribution of unisensory and bisensory responses across the recording sites on our silicon probes (both the 4 x 4 and 16 x 1 configuration of recording sites). Visually sensitive units could be found at all the cortical depths sampled. Without histological verification of individual recording sites or a current source density analysis of local field potentials, it was not possible to confirm the laminar origin of these recordings. Nevertheless, these data suggest that visual inputs to auditory cortex are not restricted to particular layers.
Auditory Response Characteristics of Bisensory Neurons
The frequency-tuning properties of all acoustically responsive units were assessed in order to examine the characteristics of those units that also received visual inputs. This was done by measuring the Q10 (best frequency divided by the bandwidth at 10 dB above threshold; high values indicate narrow frequency tuning). A 2-way ANOVA (with cortical area and response modality as factors) revealed a significant difference in tuning between different areas (F(5,614) = 3.93, P < 0.01), but not between the auditory and visual–auditory units (F(1,614) = 0.23, P = 0.23).
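For reference, the Q10 measure can be computed as in the sketch below, which uses the conventional definition of Q10 as best frequency divided by the bandwidth measured 10 dB above threshold, so that larger values correspond to sharper tuning. The function name `q10` and the example frequencies are hypothetical.

```python
def q10(best_frequency_khz, low_edge_khz, high_edge_khz):
    """Q10 = best frequency / bandwidth at 10 dB above threshold.

    `low_edge_khz` and `high_edge_khz` are the lower and upper frequency
    limits of the response area measured 10 dB above the unit's threshold.
    """
    bandwidth = high_edge_khz - low_edge_khz
    if bandwidth <= 0:
        raise ValueError("upper edge must exceed lower edge")
    return best_frequency_khz / bandwidth


# Hypothetical units with the same best frequency (8 kHz).
sharp_unit = q10(8.0, 7.0, 9.0)    # 2-kHz bandwidth
broad_unit = q10(8.0, 4.0, 12.0)   # 8-kHz bandwidth
```

With these values the sharply tuned unit has the larger Q10, consistent with high values indicating narrow frequency tuning.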
Figure 12A,B plots the responses of 2 units selected to demonstrate that bisensory neurons exhibited a range of frequency-tuning characteristics, which were usually typical of the field in which they were recorded. The first column shows the raster plot of each unit in response to auditory, visual, and combined visual–auditory stimulation. The central column shows the pooled PSTH in response to 3 repetitions of each of the pure-tone stimuli used to construct the FRA, which is shown in the third column. Unit A is from AAF, and unit B is from ADF. These units have FRAs that are highly typical for their respective fields, with the AAF unit having a short latency onset response and sharp frequency tuning, whereas the ADF response was more sustained and broadly tuned. Both of these units were also visually responsive but did not exhibit significant cross-modal interactions. A comparison of the Q10 values for unisensory auditory and bisensory units recorded in each of 6 cortical fields is shown in Figure 12C. No significant differences in auditory frequency tuning were found according to whether the units were sensitive to visual stimulation or not.
Anatomical Connectivity: Potential Cortical Sources of Visual Input
A total of 20 separate tracer injections were made into the auditory cortices of 11 ferrets as indicated in Table 1. In all these cases, the injection sites spanned all cortical layers but did not spread to the white matter. To serve the aim of this investigation, only connections between auditory cortex and other nonauditory sensory cortical areas will be described. Visual, somatosensory, and parietal areas of the ferret cortex were identified on the basis of previous anatomical and physiological studies (Innocenti et al. 2002; Manger et al. 2002, 2004).
The number of labeled neurons and the quality of filling varied with the tracer injected, presumably reflecting differences in the size of the injection sites and in the sensitivity and diffusion characteristics of different tracers. Although this prevented us from making quantitative comparisons of the projections arising from different injection sites, the use of different tracers avoided the limitations associated with any particular tracer and allowed us to inject two or more different tracers in the same cortex. In each case, these injections resulted in labeling in nonauditory sensory areas that was predominantly ipsilateral to the injection site and, for injections placed in the same areas, comparable in its distribution among different tracers. Retrograde labeling was also examined outside of the cerebral cortex, which, as expected, revealed labeling in different parts of the medial geniculate body, but not in the lateral geniculate or in the midbrain. Labeling was observed in the suprageniculate nucleus after tracer injections into areas on the anterior bank (ADF and AVF, data not shown).
In the cortex, retrogradely labeled cells were found in several nonauditory areas after tracer injections in the EG (Figs 13–15). Labeled cells were consistently observed in visual areas 17, 18, 19, and 20, as well as the suprasylvian cortex (SSY), and posterior parietal cortex, although their distribution varied according to the location of the injection sites. These cells were located mainly in cortical layers III and V.
Figure 13 shows the location of retrogradely labeled cells in the cortex following injections of 2 different tracers in the MEG, where the primary auditory fields, A1 and AAF, are located. Sparse labeling was found in areas 17, 18, and 20 (Fig. 13C–G), caudal posterior parietal cortex (PPc, Fig. 13D), and in SSY (Fig. 13D). The greatest number of labeled cells was found in area 20, which has been subdivided into areas 20a and 20b (Manger et al. 2004). Labeling was densest in area 20b, the smaller and more anterior of the 2 fields, with fewer labeled cells present in area 20a. Labeling in areas 17 and 18 was found near their dorsal and ventral borders, corresponding to where the peripheral visual field is represented (Law et al. 1988). Occasional cells were observed in the AES cortex, which, in the ferret, lies within the anterior bank of the pseudosylvian sulcus and receives inputs from primary visual and somatosensory areas (Ramsay and Meredith 2004; Manger et al. 2005). Tracer injections placed into the MEG in other animals produced patterns of labeling consistent with those shown in Figure 13. Injections into both caudal MEG, corresponding to A1, and more rostral MEG, corresponding to AAF, produced very similar patterns of labeling in nonauditory areas.
Figure 14 shows the pattern of cortical labeling observed after injections of BDA and CTß were placed in the AEG and posterior ectosylvian gyrus (PEG), respectively. The labeling produced by each tracer was almost entirely nonoverlapping, indicating that the higher level auditory cortical areas located on these 2 sides of the EG have different sources of input. The injection of CTß in the PEG resulted in extensive labeling in the MEG, together with some in AES cortex and in the most rostral and ventral parts of the AEG, which probably correspond to limbic areas (Fig. 14D,E). As in the MEG, the predominant nonauditory input to the PEG originated in areas 20a and 20b (Fig. 14E). Scattered cells were also observed in SSY and area 19 (Fig. 14F), although labeling in primary visual areas and PPc was virtually absent. The injection of CTß in this animal was made in the center of the PEG (Fig. 14A,C), probably spanning the common low-frequency border of fields PPF and PSF (Bizley et al. 2005
In contrast to the MEG and PEG, the primary nonauditory input to the AEG arose from SSY (Figs 14E and 15C,D). Sparser labeling was observed in areas 20a and 20b (Figs 14E and 15D), PPc (Figs 14F,H and 15C,D), and occasionally in areas 17 and 18 (Figs 14F and 15F–H). The single CTß injection illustrated in Figure 15 was placed in the center of the AEG, did not spread as far as the pseudosylvian sulcus, and therefore did not include AES cortex. In contrast, the BDA injection into the AEG shown in Figure 14 did extend into the sulcus, most likely including AES cortex. Nevertheless, the pattern of labeling that resulted from these 2 injections was very similar.
In summary, these tracer injections reveal the presence of projections to auditory cortex from several different visual cortical areas. The strongest inputs to auditory fields located on both the MEG and the PEG arise from area 20. Additionally, MEG receives weak, direct projections from primary visual areas, whereas PEG is innervated by higher visual areas. This differs from the AEG, where the largest input originates in SSY.
| Discussion |
|---|
|
|
|---|
The traditional view that integration of information across the senses occurs in the cortex only after substantial modality-specific processing has taken place has recently been cast into doubt by the discovery that multisensory interactions are prevalent in low-level cortical areas (reviewed by Schroeder and Foxe 2005










Figure 10 (caption, partially recovered). Diamonds denote visual responses; triangles denote responses to bisensory stimulation. (B) Example raster plot in which the MI obtained using mean response latency (0.44 bits for the bisensory condition) exceeded that obtained using spike counts (0.13 bits) alone.



