Boosting Speech-to-Text software potential

This article explores ways to improve the efficiency and accuracy of speech-to-text (STT) input. The study is motivated by the growing popularity of STT software among professional translators, in line with the general trend of abandoning typing in favor of dictation. Arguing that the effectiveness of such programs depends on their accuracy, the authors analyze the major linguistic and technical factors that affect the quality of computer-assisted transcription. The hypothesis is then tested in an experiment. Drawing on numerical and performance data, including errors broken down into categories to trace their origins, the article examines various approaches to dictation in combination with several hardware options and configurations. The findings form the basis for recommendations on improving STT performance with the Dragon software. The authors conclude that STT accuracy can be raised to up to 99 percent by adjusting the program profile to the speaker's phonetic features, including accent; by adding the most complex and rare vocabulary to the dictionary beforehand; and by fine-tuning the input hardware. Other noteworthy results include ways to handle the most difficult transcription challenges, such as proper names, place names, and abbreviations.
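The abstract does not say how transcription accuracy was scored. As an illustration only, the sketch below shows one common convention: word-level accuracy computed as one minus the word error rate, where errors are counted with an edit distance over words. The function names `word_errors` and `word_accuracy` are hypothetical, and this is not presented as the authors' procedure.

```python
def word_errors(reference: str, hypothesis: str) -> int:
    """Count word-level substitutions, insertions, and deletions
    (Levenshtein distance over whitespace-separated, lowercased words)."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    # prev[j] holds the distance between the reference words seen so far
    # and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # reference word missing from the transcript
                curr[j - 1] + 1,     # extra word in the transcript
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1]


def word_accuracy(reference: str, hypothesis: str) -> float:
    """Return 1 - WER, an assumed stand-in for the accuracy figure in the abstract."""
    ref_len = len(reference.split())
    if ref_len == 0:
        raise ValueError("reference text must contain at least one word")
    return 1.0 - word_errors(reference, hypothesis) / ref_len
```

Under this convention, a 99 percent score would correspond to roughly one word-level error per hundred dictated words.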

Belgorod National Research University

Andrey R. Biktimirov, Dmitry Yu. Gruzdev

RESEARCH RESULT Theoretical and Applied Linguistics

2023

