Why are listeners hindered by talker variability?
Abstract: Though listeners readily recognize speech from a variety of talkers, accommodating talker variability comes at a cost: Myriad studies have shown that listeners are slower to recognize a spoken word when there is talker variability compared with when talker is held constant. This review focuses on two possible theoretical mechanisms for the emergence of these processing penalties. One view is that multitalker processing costs arise through a resource-demanding talker accommodation process, wherein listeners compare sensory representations against hypothesized perceptual candidates and error signals are used to adjust the acoustic-to-phonetic mapping (an active control process known as contextual tuning). An alternative proposal is that these processing costs arise because talker changes involve salient stimulus-level discontinuities that disrupt auditory attention. Some recent data suggest that multitalker processing costs may be driven by both mechanisms operating over different time scales. Fully evaluating this claim requires a foundational understanding of both talker accommodation and auditory streaming; this article provides a primer on each literature and also reviews several studies that have observed multitalker processing costs. The review closes by underscoring a need for comprehensive theories of speech perception that better integrate auditory attention and by highlighting important considerations for future research in this area.
Related Results
Xie, Liu, & Jaeger (2020). Cross-talker generalization during foreign-accented speech perception
Speech perception depends on the ability to generalize previously experienced input effectively across talkers. How such cross-talker generalization is achieved has remained an ope...
Recognizing voices through a cochlear implant: A systematic review
Objective: Some cochlear implant (CI) users report having difficulty accessing indexical information in the speech signal, presumably due to the transformation from acoustic to ele...
Talker and accent familiarity yield advantages for voice identity perception: a voice sorting study
Familiarity benefits in voice identity perception have been frequently described in the literature. Typically, studies have contrasted listeners who were either familiar or unfamil...
Effects of Spatial Speech Presentation on Listener Response Strategy for Talker-Identification
This study investigates effects of spatial auditory cues on human listeners' response strategy for identifying two alternately active talkers (“turn-taking” listening scenario). Pr...
Talker variability facilitates the statistical learning of speech sounds
Natural speech contains many sources of acoustic variability both within and between talkers, which challenges speech recognition in some contexts but may facilitate language under...
Listeners are initially flexible in updating phonetic beliefs over time: A replication and replacement of Saltzman and Myers (2018)
Perceptual learning serves as a mechanism for listeners to adapt to novel phonetic information. Distributional tracking theories posit that this adaptation occurs as a result of li...
Effects of talker uncertainty I: Auditory word recognition
The production and resulting acoustic composition of spoken words vary as functions of individual talker characteristics. However, the effects of talker differences on auditory wor...
Neural Speech-Tracking During Selective Attention: A Spatially Realistic Audiovisual Study
Paying attention to a target talker in multi-talker scenarios is associated with its more accurate neural-tracking relative to competing non-target speech....

