Speaker-dependent automatic speech recognition in Buryat
Due to the dominance of Russian in education, formal communication, and mass media, the use of Buryat is gradually declining, and language preservation is therefore a serious concern for its native speakers. Under current conditions, considerable attention is being given to speech technologies based on natural language processing, artificial intelligence, and other methods that digitize the Buryat language and provide online access to resources. An analysis of existing resources found no Buryat-targeted technologies that work with audio data. The present study therefore covers the development of a prototype speech recognition model for automatic speech processing in Buryat. The model uses the DeepSpeech2 architecture, which combines convolutional neural network layers that extract and compress the acoustic information relevant for recognition with recurrent connections that identify time dependencies between phones, in order to build a model of a word in the Buryat language. The percentage of incorrectly recognized phones is 11%. This error rate is significantly higher than the values typical of modern models. The main limitations of the model are the insufficient amount of training data and its dependence on the speaker's voice. Expanding the speech sample bank, annotating the data, and optimizing the model can improve the accuracy of the system and its robustness to speech variability.
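The abstract reports the share of incorrectly recognized phones but does not say how it was computed. A common convention, assumed here, is the phone error rate: the Levenshtein (edit) distance between reference and recognized phone sequences, divided by the total number of reference phones. The sketch below illustrates that convention only; the function names and the placeholder phone symbols are hypothetical and do not come from the study.

```python
from typing import Sequence


def edit_distance(ref: Sequence[str], hyp: Sequence[str]) -> int:
    """Levenshtein distance: minimum substitutions, insertions, deletions."""
    # prev[j] holds the distance between the processed prefix of ref and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]  # distance from ref[:i] to an empty hypothesis
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # deletion of a reference phone
                curr[j - 1] + 1,     # insertion of an extra phone
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1]


def phone_error_rate(refs: Sequence[Sequence[str]],
                     hyps: Sequence[Sequence[str]]) -> float:
    """Total edit errors divided by total reference phones (assumed metric)."""
    total_phones = sum(len(r) for r in refs)
    if total_phones == 0:
        raise ValueError("reference sequences contain no phones")
    total_errors = sum(edit_distance(r, h) for r, h in zip(refs, hyps))
    return total_errors / total_phones


# Made-up placeholder symbols, not real Buryat phones or data from the study.
refs = [["a", "b", "c", "d"], ["e", "f", "g", "h", "i"]]
hyps = [["a", "x", "c", "d"], ["e", "f", "g", "h", "i"]]
print(f"{phone_error_rate(refs, hyps):.1%}")  # one substitution in nine phones -> 11.1%
```

Pooling errors over all utterances before dividing weights long utterances more heavily than averaging per-utterance rates would; which of the two the study used is not stated.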
Amur State University
Related Results
Buryat historical sources: digital infrastructure of machine translation
The study is dedicated to a vast yet still underexplored corpus of Buryat historical sources in the Old Written Mongolian language, preserved in academic and archival institutions ...
Automatic speech recognition in voice-speech rehabilitation effectiveness evaluation in patients after laryngectomy
Introduction.
Lost voice function compensation determines the personal and social life of laryngectomees. Automatic speech recognition and synthesis methods are...
Cattle in Buryat Mythology and Ritual
This study addresses, on the basis of ethnographic, folkloric, linguistic, and field data, the role of cattle in Buryat myths and rites, with reference to their economic significan...
Multimodal Emotion Recognition and Human Computer Interaction for AI-Driven Mental Health Support (Preprint)
BACKGROUND
Mental health has become one of the most urgent global health issues of the twenty-first century. The World Health Organization (WHO) reports tha...
Database of Buryat Genealogies: Major Approaches and Implementation
The article develops a database of archival sources covering Buryat family trees. Genealogic data were traditionally very important for Buryat society as they were linked to such a...
State of Pituitary-Ovarian Link of the Neuroendocrine Regulation System in Women of Reproductive Age with Ovarian Hyperandrogenism
Background: This study aimed to evaluate the state of the pituitary-ovarian link of the neuroendocrine regulation system in women of reproductive age with OH of the main ethnic gro...
Traditional Buryat Beliefs About Birds
This study, based on ethnographic, linguistic, and folk materials, describes and interprets Buryat ideas of birds. The analysis of lexical data reveals the principal groups of bird...
History of Mahakala Cult in Buryatia
The history of the cult of Mahakala in Buryat Buddhism is considered. A short introduction to the history of the deity is presented. It is noted that this is one of the main patron...

