Introduction
In everyday life, perceptual events often occur in multiple sensory modalities at once: we hear someone speaking as we see their mouth move. Most scientific investigations have focused on single modalities (frequently vision) in isolation. Recently, there has been increasing interest in studying integration across sensory modalities. In this review, I discuss progress in studying the brain mechanisms of multisensory integration in human lateral occipital-temporal cortex, especially functional magnetic resonance imaging (fMRI) studies of superior temporal sulcus (STS), area LO and area MT (see glossary for a brief definition of these terms). Links between human neuroimaging studies and studies in non-human primates are made using techniques from computational neuroanatomy that permit alignment of human and monkey brains.
An ongoing discussion concerns the appropriate methods for studying multisensory integration using fMRI [1•, 2•, 3]. One important method is to contrast unisensory stimulation conditions with multisensory conditions. The hallmark of multisensory integration is that unisensory stimuli presented in combination produce an effect different from the linear combination of the unisensory stimuli presented separately. In individual neurons, these differences can be quite dramatic, with multisensory responses that are much greater than the sum of individual unisensory responses (‘super-additivity’). However, because fMRI measurements integrate across thousands or millions of neurons, the super-additivity measure might not be appropriate [2•]. Instead, increasingly liberal criteria might be more suitable, such as requiring only that multisensory responses are greater than the maximum or mean of the individual unisensory responses [1•].
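The three criteria mentioned above can be written out explicitly. The following is a minimal sketch, not a method taken from any of the cited studies: the function name, the dictionary keys and the example amplitudes are hypothetical, and the inputs are assumed to be response amplitudes (for example, percent BOLD signal change relative to baseline) for two unisensory conditions and the combined condition.

```python
def integration_criteria(auditory, visual, multisensory):
    """Evaluate three candidate criteria for multisensory integration.

    All arguments are response amplitudes in the same units (assumed here
    to be percent signal change relative to baseline).
    """
    return {
        # Strictest: multisensory response exceeds the sum of unisensory responses.
        "super_additive": multisensory > auditory + visual,
        # More liberal: exceeds the larger of the two unisensory responses.
        "exceeds_max": multisensory > max(auditory, visual),
        # Most liberal: exceeds the mean of the two unisensory responses.
        "exceeds_mean": multisensory > (auditory + visual) / 2,
    }


# Hypothetical amplitudes, chosen only to illustrate how the criteria differ.
example = integration_criteria(auditory=0.4, visual=0.6, multisensory=0.8)
print(example)
```

With these illustrative values, the response fails the super-additivity criterion but meets both the maximum and the mean criteria. When both unisensory responses are non-negative, any response that passes a stricter criterion also passes the more liberal ones.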
Another important issue is the high degree of inter-subject and inter-laboratory variability observed in fMRI studies. STS, LO and MT are attractive targets for a review because there is some consensus on their anatomical location. This is either because they constitute an anatomical structure observed in every normal human hemisphere (such as STS) or because their response properties make it possible to identify them with functional localizers (somewhat ambiguously for LO, unambiguously for MT). By starting out with well-defined regions, a review can sidestep some of the difficulties inherent in deciding whether a stereotaxic coordinate reported in one study of multisensory integration corresponds to the same cortical region as a coordinate from a different study.
Although STS, LO and MT are found in relative proximity, within the space of a few centimeters in human lateral occipital-temporal cortex, their multisensory response properties are quite different, as is our level of knowledge about their role in multisensory perception. Therefore, this review attempts to compare and contrast the activity in these three areas in response to stimuli in three sensory modalities: visual, auditory and tactile. Figure 1 illustrates the location of STS, LO and MT in folded and inflated versions of a human brain, and their relationship to Brodmann's cytoarchitectonic classification scheme.
Glossary
Area MT (V5): A region in extrastriate visual cortex distinguished by its heavy myelination and specialization for processing visual motion. It was first described in the posterior middle temporal cortex of owl monkey [53], leading to the designation MT. In macaque monkeys, this region lies in the posterior bank of the superior temporal sulcus, where some investigators have designated it V5 [54]. A homologous region has been found in many other species, including humans, where it lies near the junction of the inferior temporal sulcus and the lateral occipital sulcus [55].
Congruent and incongruent stimuli: Because different sensory modalities can be stimulated independently in an experimental setting, multisensory stimuli can be congruent (such as a picture of a car presented with the sound of a car) or incongruent (such as a picture of a car presented with the sound of a telephone).
fMRI (functional magnetic resonance imaging): A non-invasive method for measuring neuronal activity, typically with an indirect measure such as blood-oxygenation level dependent (BOLD) contrast.
Localizer: There is only a rough correlation between visible anatomical structures (such as specific sulci or gyri) and the functional areas that comprise the computational organization of the brain. However, in order to make inferences about organization, it is important to compare the same functional area across subjects. A common technique is to use a localizer fMRI scan (for instance, alternating moving and static stimuli) in order to identify a specific region of interest (for instance, area MT). Additional experiments are then performed and the results compared across subjects within this region.
Multisensory: Refers to the processing of stimuli presented in multiple sensory modalities at once. Although the term ‘multimodal’ is sometimes used as a synonym for multisensory, it is also used to describe studies that use multiple measurement techniques, such as fMRI and magnetoencephalography (MEG). Therefore, the term multisensory is preferred.
Synchronous and asynchronous stimuli: An experimental manipulation that involves artificially changing the temporal offset between stimuli presented in different sensory modalities in order to measure the effect on multisensory integration. A familiar example of asynchrony is the discomforting sensation produced when the dialogue in the sound track of a movie is offset from the images.