
Boosting Speech-to-Text software potential

The article looks for ways to improve the efficiency and accuracy of Speech-to-Text (STT) input. The work is motivated by the growing popularity of STT software among professional translators, in line with the general trend of abandoning typing in favor of speech-to-text applications. Arguing that the effectiveness of such programs depends on their accuracy, the researchers analyze the major linguistic and technical factors that affect the quality of computer-assisted speech transcription. They then test the hypothesis in an experiment. Drawing on numerical and performance data, and on a breakdown of errors into categories intended to identify their origins, the article examines various approaches to dictation in combination with several hardware options and configurations. These results inform recommendations for improving STT performance with the Dragon software. The authors conclude that STT accuracy can be raised to as much as 99 percent by adjusting the program profile to the speaker's phonetic features, including accent; adding the most complex and rare vocabulary to the dictionary beforehand; and fine-tuning the input hardware. Other noteworthy results include ways to overcome the most difficult transcription challenges, such as proper names, place names, and abbreviations.
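The abstract does not say how accuracy was measured. As a minimal sketch, assuming accuracy is taken as one minus the word error rate (WER) of the transcript against a reference text, the figure could be computed as below. The function names and the lowercase, whitespace-based tokenization are illustrative assumptions, not the authors' method.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    # Assumption: case-insensitive comparison, words split on whitespace.
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] is the edit distance between the reference words seen so far
    # and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # reference word dropped by the recognizer
                curr[j - 1] + 1,     # extra word inserted by the recognizer
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1] / len(ref)


def accuracy(reference: str, hypothesis: str) -> float:
    """Accuracy under the stated assumption: 1 - WER."""
    return 1.0 - word_error_rate(reference, hypothesis)
```

Under this assumption, a 99 percent accuracy corresponds to one word-level error per 100 dictated words.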

Related Results

Perception advantages of foreign directed speech
Foreign directed speech (FDS) is a listener directed speech style used when native speakers interact with non-native listeners of a language. This study considers if native and non...
Developmental Links Between Speech Perception in Noise, Singing, and Cortical Processing of Music in Children with Cochlear Implants
The perception of speech in noise is challenging for children with cochlear implants (CIs). Singing and musical instrument playing have been associated with improved auditory skill...
Surrogate Speech of the Asante Ivory Trumpeters of Ghana
Surrogate speech is a phonological system by which word tones of a spoken language are represented in tones produced on a musical instrument. Ethnomusicologists regard this as a mu...
Free Software Beyond Radical Politics: Negotiations of Creative and Craft Autonomy in Digital Visual Media Production
Free software development and the technological practices of hackers have been broadly recognised as fundamental for the formation of political cultures that foster democracy in th...
Speech in “Paradise Lost”
In the sixteenth and seventeenth centuries several treatises (religious, philosophical, and rhetorical) discussed the Fall of Man as involving a corruption ...
An overview of Microsoft’s Whistler text-to-speech system
The data-driven approach can significantly facilitate the process of creating text-to-speech (TTS) systems for a new language, a new voice, or a new style. As such, Whistler TTS en...
In Memoriam: Ralph L. Vanderslice and Gunnar Fant
RALPH L. VANDERSLICE, who contributed to many areas of phonetics, died on 24 August 2008, aged 78, in Portland, Oregon. He was born on 2 January 1930 in South Bend, Indiana. He rec...
