Why are listeners hindered by talker variability?

Abstract: Though listeners readily recognize speech from a variety of talkers, accommodating talker variability comes at a cost: Myriad studies have shown that listeners are slower to recognize a spoken word when there is talker variability compared with when talker is held constant. This review focuses on two possible theoretical mechanisms for the emergence of these processing penalties. One view is that multitalker processing costs arise through a resource-demanding talker accommodation process, wherein listeners compare sensory representations against hypothesized perceptual candidates and error signals are used to adjust the acoustic-to-phonetic mapping (an active control process known as contextual tuning). An alternative proposal is that these processing costs arise because talker changes involve salient stimulus-level discontinuities that disrupt auditory attention. Some recent data suggest that multitalker processing costs may be driven by both mechanisms operating over different time scales. Fully evaluating this claim requires a foundational understanding of both talker accommodation and auditory streaming; this article provides a primer on each literature and also reviews several studies that have observed multitalker processing costs. The review closes by underscoring a need for comprehensive theories of speech perception that better integrate auditory attention and by highlighting important considerations for future research in this area.

Springer Science and Business Media LLC

Psychonomic Bulletin & Review

Related Results

Xie, Liu, & Jaeger (2020). Cross-talker generalization during foreign-accented speech perception

Speech perception depends on the ability to generalize previously experienced input effectively across talkers. How such cross-talker generalization is achieved has remained an ope...

Recognizing voices through a cochlear implant: A systematic review

Objective: Some cochlear implant (CI) users report having difficulty accessing indexical information in the speech signal, presumably due to the transformation from acoustic to ele...

Talker and accent familiarity yield advantages for voice identity perception: a voice sorting study

Familiarity benefits in voice identity perception have been frequently described in the literature. Typically, studies have contrasted listeners who were either familiar or unfamil...

Effects of Spatial Speech Presentation on Listener Response Strategy for Talker-Identification

This study investigates effects of spatial auditory cues on human listeners' response strategy for identifying two alternately active talkers (“turn-taking” listening scenario). Pr...

Talker variability facilitates the statistical learning of speech sounds

Natural speech contains many sources of acoustic variability both within and between talkers, which challenges speech recognition in some contexts but may facilitate language under...

Listeners are initially flexible in updating phonetic beliefs over time: A replication and replacement of Saltzman and Myers (2018)

Perceptual learning serves as a mechanism for listeners to adapt to novel phonetic information. Distributional tracking theories posit that this adaptation occurs as a result of li...

Effects of talker uncertainty I: Auditory word recognition

The production and resulting acoustic composition of spoken words vary as functions of individual talker characteristics. However, the effects of talker differences on auditory wor...

Neural Speech-Tracking During Selective Attention: A Spatially Realistic Audiovisual Study

Paying attention to a target talker in multi-talker scenarios is associated with its more accurate neural-tracking relative to competing non-target speech....