
NeuroImage

Volume 26, Issue 2, June 2005, Pages 592-599

Left thalamo-cortical network implicated in successful speech separation and identification

https://doi.org/10.1016/j.neuroimage.2005.02.006

Abstract

The separation of concurrent sounds is paramount to human communication in everyday settings. The primary auditory cortex and the planum temporale are thought to be essential both for separating physical sound sources into perceptual objects and for comparing those representations with previously learned acoustic events. To examine the role of these areas in speech separation, we measured brain activity using event-related functional magnetic resonance imaging (fMRI) while participants identified two phonetically different vowels presented simultaneously. The processing of brief speech sounds (200 ms in duration) activated the thalamus and superior temporal gyrus bilaterally, the left anterior temporal lobe, and the left inferior temporal gyrus. A comparison of fMRI signals between trials in which participants successfully identified both vowels and trials in which only one of the two vowels was recognized revealed enhanced activity in the left thalamus, Heschl's gyrus, superior temporal gyrus, and planum temporale. Because participants successfully identified at least one of the two vowels on each trial, the difference in fMRI signal indexes the extra computational work needed to segregate and identify the other, concurrently presented vowel. The results support the view that auditory cortex in or near Heschl's gyrus, as well as the planum temporale, is involved in sound segregation, and they reveal a link between left thalamo-cortical activation and the successful separation and identification of simultaneous speech sounds.

Section snippets

Participants

A total of 11 right-handed participants whose native language was English were recruited for the present study. Two participants were excluded from the analysis because they performed near ceiling (89 and 95% accuracy, respectively) and consequently there were not enough incorrect trials to be analyzed. Nine participants (4 women and 5 men aged between 21 and 30 years, mean age = 26 ± 3.5 years) formed the final sample. None had any history of hearing, neurological, or psychiatric disorders.

Behavioral results

Fig. 1 shows the group mean proportion of trials on which both vowels were correctly identified as a function of f0 difference (Δf0). Participants performed well above chance even when the two vowels shared the same f0. Increasing Δf0 led to a moderate, albeit significant, improvement in vowel identification, F(4,32) = 3.78, P < 0.05. Pairwise comparisons
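The reported statistic can be sanity-checked against the design. This is a minimal sketch, assuming SciPy is available and that the ANOVA treated the five Δf0 levels as a within-subject factor across the nine participants, which yields the degrees of freedom (4, 32) reported above:

```python
# Sanity check of the reported repeated-measures ANOVA result, F(4,32) = 3.78.
# Assumption (inferred from the Methods snippet, not stated as code in the paper):
# k = 5 delta-f0 levels and n = 9 participants.
from scipy.stats import f

k, n = 5, 9                       # delta-f0 levels, participants
df_effect = k - 1                 # numerator df: 5 - 1 = 4
df_error = (k - 1) * (n - 1)      # denominator df: 4 * 8 = 32

# Survival function gives the p-value for the observed F statistic.
p = f.sf(3.78, df_effect, df_error)
print(df_effect, df_error, p)
```

The degrees of freedom match the final sample of nine participants, and the resulting p-value falls below the 0.05 threshold, consistent with the significant main effect of Δf0 reported in the text.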

Discussion

Auditory streaming is a critical stage in auditory perception that permits listeners to identify the various sound sources contained in the incoming acoustic wave. Here, using fMRI, enhanced brain activity was observed in left thalamus, superior temporal gyrus, Heschl's gyrus, and in the planum temporale when participants successfully identified two concurrently presented vowels as opposed to when only one of the two vowels was recognized. The results suggest that success in speech separation

Acknowledgments

This work was supported by grants from the Canadian Institutes of Health Research and the Natural Sciences and Engineering Research Council of Canada. We wish to thank B. Dyson and J. Snyder for their comments on the manuscript and valuable discussion. We are particularly indebted to Peter Assmann and Quentin Summerfield for providing the vowel stimuli and to Virginia Penhune and Robert Zatorre for providing the statistical maps of the primary auditory cortex.

References (46)

  • E. Sussman et al., Top–down effects can modify the initially stimulus-driven auditory organization, Brain Res. Cogn. Brain Res. (2002)
  • C. Alain et al., Effects of attentional load on auditory scene analysis, J. Cogn. Neurosci. (2003)
  • C. Alain et al., "What" and "where" in the human auditory system, Proc. Natl. Acad. Sci. U. S. A. (2001)
  • C. Alain et al., Bottom–up and top–down influences on auditory scene analysis: evidence from event-related brain potentials, J. Exp. Psychol. Hum. Percept. Perform. (2001)
  • C. Alain et al., Age-related changes in detecting a mistuned harmonic, J. Acoust. Soc. Am. (2001)
  • C. Alain et al., Neural activity associated with distinguishing concurrent auditory objects, J. Acoust. Soc. Am. (2002)
  • Alain, C., Reinke, K.S., He, Y., Wang, C., Lobaugh, N., in press. Hearing two things at once: neurophysiological...
  • P. Assmann et al., The contribution of waveform interactions to the perception of concurrent vowels, J. Acoust. Soc. Am. (1994)
  • M.A. Bee et al., Primitive auditory stream segregation: a neurophysiological study in the songbird forebrain, J. Neurophysiol. (2004)
  • J.R. Binder et al., Human temporal lobe activation by speech and nonspeech sounds, Cereb. Cortex (2000)
  • M.H. Chalikia et al., The perceptual segregation of simultaneous auditory signals: pulse train segregation and vowel segregation, Percept. Psychophys. (1989)
  • R.W. Cox et al., Software tools for analysis and visualization of fMRI data, NMR Biomed. (1997)
  • B. Dyson et al., Representation of concurrent acoustic objects in primary auditory cortex, J. Acoust. Soc. Am. (2004)
Cited by (56)

  • Noise and pitch interact during the cortical segregation of concurrent speech
    2017, Hearing Research
    Citation Excerpt:

    Segregation of speech and non-speech signals is thought to reflect a complex, distributed neural network involving both subcortical and cortical brain regions (Alain et al., 2005b; Bidelman and Alain, 2015a; Dyson and Alain, 2004; Palmer, 1990; Sinex et al., 2002). In humans, functional magnetic resonance imaging (fMRI) implicates a left thalamo-cortical network including thalamus, bilateral superior temporal gyrus, and left anterior temporal lobe in successful double-vowel segregation (Alain et al., 2005b). Event-related brain potentials (ERPs) have further delineated the time course of concurrent speech processing, with modulations in neural activity ∼150–200 ms and 350–400 ms after sound onset (Alain et al., 2005a, 2007; Reinke et al., 2003).

  • Theta oscillations accompanying concurrent auditory stream segregation
    2016, International Journal of Psychophysiology

  • Auditory-limbic interactions in chronic tinnitus: Challenges for neuroimaging research
    2016, Hearing Research
    Citation Excerpt:

    Given the role the lateral prefrontal cortex plays in conscious and effortful cognitive control (Miller and Cohen, 2001; Miyake et al., 2000), this effect most likely represents the increased effort needed to ignore a loud tinnitus signal during our auditory task. As another example, our group and others have noted a relationship between tinnitus and increased activity in posterior auditory cortex (Giraud et al., 1999; Leaver et al., 2011; Lockwood et al., 2001; Reyes et al., 2002), which has been implicated in separating multiple auditory signals (e.g., listening to a single voice at a cocktail party; Alain et al., 2005; Wilson et al., 2007). In these examples, different cognitive processes (and associated neural substrates) direct attention away from the tinnitus signal to other sensory events and may temporarily attenuate the tinnitus percept.