Cortical Representations of Speech in a Multi-talker Auditory Scene

Abstract

The ability to parse a complex auditory scene into perceptual objects is facilitated by a hierarchical auditory system. Successive stages in the hierarchy transform an auditory scene of multiple overlapping sources, from peripheral tonotopically-based representations in the auditory nerve, into perceptually distinct auditory-objects based representation in auditory cortex. Here, using magnetoencephalography (MEG) recordings from human subjects, both men and women, we investigate how a complex acoustic scene consisting of multiple speech sources is represented in distinct hierarchical stages of auditory cortex. Using systems-theoretic methods of stimulus reconstruction, we show that the primary-like areas in auditory cortex contain dominantly spectro-temporal based representations of the entire auditory scene. Here, both attended and ignored speech streams are represented with almost equal fidelity, and a global representation of the full auditory scene with all its streams is a better candidate neural representation than that of individual streams being represented separately. In contrast, we also show that higher order auditory cortical areas represent the attended stream separately, and with significantly higher fidelity, than unattended streams. Furthermore, the unattended background streams are more faithfully represented as a single unsegregated background object rather than as separated objects. Taken together, these findings demonstrate the progression of the representations and processing of a complex acoustic scene up through the hierarchy of human auditory cortex.

Significance Statement

Using magnetoencephalography (MEG) recordings from human listeners in a simulated cocktail party environment, we investigate how a complex acoustic scene consisting of multiple speech sources is represented in separate hierarchical stages of auditory cortex. We show that the primary-like areas in auditory cortex use a dominantly spectro-temporal based representation of the entire auditory scene, with both attended and ignored speech streams represented with almost equal fidelity. In contrast, we show that higher order auditory cortical areas represent an attended speech stream separately from, and with significantly higher fidelity than, unattended speech streams. Furthermore, the unattended background streams are represented as a single undivided background object rather than as distinct background objects.
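The abstract names stimulus reconstruction as the analysis method but does not spell it out. As a rough illustration only, the sketch below shows one common form of the idea: a ridge-regularized linear decoder maps time-lagged multichannel responses back to a speech envelope, and the correlation between the reconstruction and the actual envelope serves as a fidelity score. Everything here is an assumption made for illustration, not the authors' pipeline: the function names (`lag_matrix`, `fit_decoder`, `reconstruct`, `fidelity`, `demo`), the lag count, the ridge value, and the synthetic data, in which the "attended" stream is deliberately given a larger gain than the "ignored" one.

```python
import numpy as np

def lag_matrix(responses, n_lags):
    """Stack every channel at lags 0..n_lags-1, so row t sees samples t..t+n_lags-1."""
    n_times, n_channels = responses.shape
    X = np.zeros((n_times, n_channels * n_lags))
    for lag in range(n_lags):
        cols = slice(lag * n_channels, (lag + 1) * n_channels)
        X[: n_times - lag, cols] = responses[lag:]  # zero-padded at the end
    return X

def fit_decoder(responses, envelope, n_lags=10, ridge=1.0):
    """Ridge regression from lagged responses to the stimulus envelope."""
    X = lag_matrix(responses, n_lags)
    gram = X.T @ X + ridge * np.eye(X.shape[1])
    return np.linalg.solve(gram, X.T @ envelope)

def reconstruct(responses, weights, n_lags=10):
    """Apply a fitted decoder to (possibly held-out) responses."""
    return lag_matrix(responses, n_lags) @ weights

def fidelity(reconstruction, envelope):
    """Pearson correlation between reconstructed and actual envelope."""
    return float(np.corrcoef(reconstruction, envelope)[0, 1])

def demo(seed=0, n_times=4000, n_channels=8, n_lags=10, delay=3):
    """Toy data: hypothetical sensors follow two streams with unequal gain."""
    rng = np.random.default_rng(seed)
    attended = rng.standard_normal(n_times)
    ignored = rng.standard_normal(n_times)
    # Assumed mixing weights; the ignored stream is scaled down by design.
    mix_att = np.linspace(0.5, 1.5, n_channels)
    mix_ign = 0.2 * np.linspace(1.5, 0.5, n_channels)
    driven = (np.outer(np.roll(attended, delay), mix_att)
              + np.outer(np.roll(ignored, delay), mix_ign))
    responses = driven + rng.standard_normal((n_times, n_channels))
    half = n_times // 2  # fit on the first half, score on the second
    scores = []
    for envelope in (attended, ignored):
        weights = fit_decoder(responses[:half], envelope[:half], n_lags)
        estimate = reconstruct(responses[half:], weights, n_lags)
        scores.append(fidelity(estimate, envelope[half:]))
    return tuple(scores)

if __name__ == "__main__":
    fid_attended, fid_ignored = demo()
    print(f"attended: {fid_attended:.2f}, ignored: {fid_ignored:.2f}")
```

In this toy setup the attended stream reconstructs with higher fidelity only because the synthetic mixing weights favor it; the point is the shape of the comparison (one decoder per candidate representation, scored on held-out data), not the numbers.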

openRxiv

Krishna C. Puvvada, Jonathan Z. Simon

2017

Related Results

Speech perception depends on the ability to generalize previously experienced input effectively across talkers. How such cross-talker generalization is achieved has remained an ope...

Recognizing voices through a cochlear implant: A systematic review

Objective: Some cochlear implant (CI) users report having difficulty accessing indexical information in the speech signal, presumably due to the transformation from acoustic to ele...

Effects of Spatial Speech Presentation on Listener Response Strategy for Talker-Identification

This study investigates effects of spatial auditory cues on human listeners' response strategy for identifying two alternately active talkers (“turn-taking” listening scenario). Pr...

Talker variability facilitates the statistical learning of speech sounds

Natural speech contains many sources of acoustic variability both within and between talkers, which challenges speech recognition in some contexts but may facilitate language under...

Effects of talker uncertainty I: Auditory word recognition

The production and resulting acoustic composition of spoken words vary as functions of individual talker characteristics. However, the effects of talker differences on auditory wor...

Why are listeners hindered by talker variability?

Abstract Though listeners readily recognize speech from a variety of talkers, accommodating talker variability comes at a cost: Myriad studies have shown that listeners are slower t...

Neural Speech-Tracking During Selective Attention: A Spatially Realistic Audiovisual Study

Abstract Paying attention to a target talker in multi-talker scenarios is associated with its more accurate neural-tracking relative to competing non-target speech....

Gender Effects on Binaural Speech Auditory Brainstem Response

BACKGROUND: The speech auditory brainstem response is a tool that provides direct information on how speech sound is temporally and spectrally coded by the auditory brainstem. Spee...

