Trends in Cognitive Sciences
Opinion
Object-based auditory and visual attention
Introduction
At a cocktail party, the sounds of clinking glasses and exuberant voices add acoustically before entering your ears. To appreciate your companion's anecdote, you must filter out extraneous sources (see Glossary) and focus attention on her voice. At the same time, the sounds that you tune out are crucial for maintaining awareness of your environment. Indeed, a source of interference (the pompous man on your right) might become the very source you want to understand in the next moment (e.g. when you realize he is relaying a juicy story about your boss). To maneuver successfully in everyday settings, you need to be able to both focus and shift attention as the need arises.
Theories of visual attention explain many striking perceptual phenomena that arise when viewing complex scenes, from change blindness (failure to notice a change in a visual scene because attention is directed to another part of the image) to performance on visual search tasks [1,2]. Although there is much current interest in how central limitations interfere with auditory perception, there is no comprehensive framework to explain our ability to understand sound sources in complex acoustic scenes. Here, I argue that many auditory phenomena, including how we manage to converse at a cocktail party, can be understood by properly extending theories of visual attention. This commonality supports the idea that the same neural processes control visual and auditory attention [3].
Section snippets
Auditory objects
Theories of visual attention argue that observers focus attention on an object in a complex scene [2]. Unfortunately, just as in vision [4], it is difficult to define what constitutes an object in audition. This difficulty arises, in part, because there are few absolute rules governing auditory object formation. Audible sound in a mixture is not always allocated between the objects perceived in a scene, and can contribute either to multiple objects [5,6] or to no object [7]. The state of the
Object formation
In a visual scene, objects form locally based on contiguous geometric structure, such as edges, boundaries and contours [4]. Discrete local patches can be perceptually linked, based on similarity of texture, color and other features, to form whole objects [4].
In a similar way, auditory objects form across different analysis scales. For sound elements with contiguous spectro-temporal structure, formation relies primarily on this local structure [12,13], including common onsets and offsets,
Object-based attention
Object formation directly influences how we perceive and process complex scenes. In all sensory modalities, the normal mode of analyzing a complex scene is to focus on one object while other objects are in the perceptual background [17,18]. In vision, this mode of perceiving is described as a biased competition between perceptual objects [2]. Biased competition takes place automatically and ubiquitously when there are multiple objects in a scene. Which object wins the competition depends both on
Understanding perception of complex scenes
Because attention is object based, competing sources in a complex scene can cause many different forms of perceptual interference, some of which are considered below. An overview of the interactions affecting auditory perception is shown in Figure 1.
Summary
In both vision and audition, we direct top-down attention to select desired objects from a complex scene. Because perceptual objects are the basic units of attention, proper object formation is crucial to this ability. Stimulus structure determines how objects form locally, either in space-time (for visual objects) or time-frequency (for auditory objects). Higher-order perceptual attributes enable both object formation across larger scales and selection of a desired object from a complex scene.
Acknowledgements
Grants from NIDCD, AFOSR, ONR and NSF supported this work. These ideas were developed through discussions with Gin Best, Antje Ihlefeld, Erick Gallun, Chris Mason, Gerald Kidd, Steve Colburn and Nat Durlach.
Glossary
- Energetic masking: perceptual interference present in the sensory epithelium.
- Informational masking: perceptual interference that cannot be explained by energetic masking.
- Object: a perceptual estimate of the content of a discrete physical source.
- Salience: the perceptual strength of an input based purely on stimulus attributes.
- Similarity: a putative explanation for auditory informational masking when a target and competing sources have similar perceptual features.
- Source: a discrete physical entity in the
References (49)
- et al.
Change blindness: past, present, and future
Trends Cogn. Sci.
(2005)
What is a visual object?
Trends Cogn. Sci.
(2003)
- et al.
Temporal dynamics of auditory and visual bistability reveal common principles of perceptual organization
Curr. Biol.
(2006)
- et al.
Auditory grouping
How the brain separates sounds
Trends Cogn. Sci.
(2004)
Parietal mechanisms of switching and maintaining attention to locations, objects, and features
Objects and attention: the state of the art
Cognition
(2001)
- et al.
Neural mechanisms of selective visual attention
Annu. Rev. Neurosci.
(1995)
Preparatory activity in visual cortex indexes distractor suppression during covert spatial orienting
J. Neurophysiol.
(2004)
- et al.
Limits on phonetic integration in duplex perception
Percept. Psychophys.
(1996)
Perceiving vowels in the presence of another sound: a quantitative test of the “old-plus-new” heuristic
A sound element gets lost in perceptual competition
Proc. Natl. Acad. Sci. U. S. A.
Effects of location, frequency region, and time course of selective attention on auditory scene analysis
J. Exp. Psychol. Hum. Percept. Perform.
The role of attention in the formation of auditory streams
Percept. Psychophys.
Effects of attention and unilateral neglect on auditory stream segregation
J. Exp. Psychol. Hum. Percept. Perform.
Auditory Scene Analysis: The Perceptual Organization of Sound
Some characteristics of auditory spatial attention revealed using rhythmic masking release
Percept. Psychophys.
Effectiveness of spatial cues, prosody, and talker characteristics in selective attention
J. Acoust. Soc. Am.
EPS Mid-Career Award 2004: brain mechanisms of attention
Q. J. Exp. Psychol. (Colchester)
Configural and contextual prioritization in object-based attention
Psychon. Bull. Rev.
How visual salience wins the battle for awareness
Nat. Neurosci.
Fundamental components of attention
Annu. Rev. Neurosci.
The spread of attention across modalities and space in a multisensory object
Proc. Natl. Acad. Sci. U. S. A.
Parietal cortex mediates voluntary control of spatial and nonspatial auditory attention
J. Neurosci.