
DNN-based Multilingual Acoustic Modeling for Four Ethiopian Languages

In this paper, we present the results of experiments on multilingual acoustic modeling for the development of an Automatic Speech Recognition (ASR) system, using speech data from four phonetically closely related Ethiopian languages (Amharic, Tigrigna, Oromo and Wolaytta) with multilingual (ML) mix and multitask approaches. Using speech data from only these closely related languages improved on the results reported in a previous work that used 26 languages (including these four). The largest Word Error Rate (WER) reduction, from 25.03% (in the previous work) to 21.52%, was achieved for Wolaytta, a relative WER reduction of 14.02%. Compared with a monolingual ASR system, multilingual acoustic modeling achieved a relative WER reduction of up to 7.36% (a WER reduction from 23.23% to 21.52%). Compared to the ML mix, the multitask approach brought a larger performance improvement (a relative WER reduction of up to 5.9%). Experiments were also conducted using Amharic and Tigrigna as one pair and Oromo and Wolaytta as another. The results showed that the languages with relatively better language resources for lexical and language modeling (Amharic and Tigrigna) benefited from the use of speech data from only two languages. Overall, the findings show that using speech corpora of phonetically related languages with the multitask multilingual modeling approach is a promising solution for developing ASR systems for less-resourced languages.
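The relative WER reductions quoted in the abstract follow from the absolute WER figures. The short sketch below reproduces that arithmetic; the function name `relative_wer_reduction` is a hypothetical helper introduced here for illustration and is not taken from the paper.

```python
def relative_wer_reduction(baseline_wer: float, new_wer: float) -> float:
    """Return the relative WER reduction, in percent, from baseline to new.

    Both arguments are WER values in percent. Hypothetical helper used
    only to illustrate the arithmetic behind the abstract's figures.
    """
    if baseline_wer <= 0:
        raise ValueError("baseline_wer must be positive")
    return (baseline_wer - new_wer) / baseline_wer * 100.0


# WER figures for Wolaytta as quoted in the abstract.
vs_previous_work = relative_wer_reduction(25.03, 21.52)  # previous 26-language work
vs_monolingual = relative_wer_reduction(23.23, 21.52)    # monolingual baseline

print(f"vs. previous work: {vs_previous_work:.2f}%")  # 14.02%
print(f"vs. monolingual:   {vs_monolingual:.2f}%")    # 7.36%
```

Both results match the percentages stated in the abstract. The 5.9% figure for multitask versus ML mix cannot be checked the same way, because the abstract does not give the underlying WER values.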

Related Results

Classification of Bisyllabic Lexical Stress Patterns Using Deep Neural Networks
Background and Objectives: As English is a stress-timed language, lexical stress plays an important role in the perception and processing of speech by native speakers. Incorrect st...
Multi-Model Ensemble Depth Adaptive Deep Neural Network for Crop Yield Prediction
Accurate prediction of crop yield enables critical tasks such as identifying the optimum crop profile for planting, assigning government resources and decision-making on imports an...
EFFECT OF BILINGUAL INSTRUCTIONAL METHOD IN THE ACADEMIC ACHIEVEMENT OF JUNIOR SECONDARY SCHOOL STUDENTS IN MATHEMATICS
The importance of mathematics in the modern society is overwhelming. The importance of mathematics has long been recognized all over the world, and that is why all students are req...
Moving towards (new) multilingual paradigms
Multilingual education is increasingly perceived as a desirable goal in a world where global networks play a significant role. Crucially, educating multilin...
Predict the Risk of Dyslipidemia via Deep Neural Networks for Survival Data
Background: Dyslipidemia is an important risk factor for coronary artery disease and stroke. Early detection and prevention of dyslipidemia can markedly alter card...
Kra-Dai Languages
Kra-Dai (also called Tai-Kadai and Kam-Tai) is a family of approximately 100 languages spoken in Southeast Asia, extending from the island of Hainan, China, in the east to the Indi...
Reducing Hallucination in Multilingual Voice Agents Using Instruction-Tuned Models
In highly applied multilingual voice agents of customer service and interactive AI systems in the world, one persistent problem constantly haunts the industry/field: hallucinations...
Acoustic cloaking design based on penetration manipulation with combination acoustic metamaterials
The acoustic wave transmission manipulation ability is the most important performance for the acoustic metamaterials. To manipulate the acoustic transmission, the combination acous...
