Javascript must be enabled to continue!

DNN-based Multilingual Acoustic Modeling for Four Ethiopian Languages

In this paper, we present the results of experiments conducted on multilingual acoustic modeling in the development of an Automatic Speech Recognition (ASR) system using speech data of phonetically much related Ethiopian languages (Amharic, Tigrigna, Oromo and Wolaytta) with multilingual (ML) mix and multitask approaches. The use of speech data from only phonetically much related languages brought improvement over results reported in a previous work that used 26 languages (including the four languages). A maximum Word Error Rate (WER) reduction from 25.03% (in the previous work) to 21.52% has been achieved for Wolaytta, which is a relative WER reduction of 14.02%. As a result of using multilingual acoustic modeling for the development of an automatic speech recognition (ASR) system, a relative WER reduction of up to 7.36% (a WER reduction from 23.23% to 21.52%) has been achieved over a monolingual ASR. Compared to the ML mix, the multitask approach brought a better performance improvement (a relative WERs reduction of up to 5.9%). Experiments have also been conducted using Amharic and Tigrigna in a pair and Oromo and Wolaytta in another pair. The results of the experiments showed that languages with a relatively better language resources for lexical and language modeling (Amharic and Tigrigna) benefited from the use of speech data from only two languages. Generally, the findings show that the use of speech corpora of phonetically related languages with the multitask multilingual modeling approach for the development of ASR systems for less-resourced languages is a promising solution.

African Journals Online (AJOL)

Solomon Teferra Martha Yifiru Tanja Schultz

SINET: Ethiopian Journal of Science

2024

Title: DNN-based Multilingual Acoustic Modeling for Four Ethiopian Languages

Description:

The use of speech data from only phonetically much related languages brought improvement over results reported in a previous work that used 26 languages (including the four languages).

A maximum Word Error Rate (WER) reduction from 25.

03% (in the previous work) to 21.

52% has been achieved for Wolaytta, which is a relative WER reduction of 14.

02%.

As a result of using multilingual acoustic modeling for the development of an automatic speech recognition (ASR) system, a relative WER reduction of up to 7.

36% (a WER reduction from 23.

23% to 21.

52%) has been achieved over a monolingual ASR.

Compared to the ML mix, the multitask approach brought a better performance improvement (a relative WERs reduction of up to 5.

9%).

Experiments have also been conducted using Amharic and Tigrigna in a pair and Oromo and Wolaytta in another pair.

The results of the experiments showed that languages with a relatively better language resources for lexical and language modeling (Amharic and Tigrigna) benefited from the use of speech data from only two languages.

Generally, the findings show that the use of speech corpora of phonetically related languages with the multitask multilingual modeling approach for the development of ASR systems for less-resourced languages is a promising solution.

Back

Background and Objectives: As English is a stress-timed language, lexical stress plays an important role in the perception and processing of speech by native speakers. Incorrect st...

Multi-Model Ensemble Depth Adaptive Deep Neural Network for Crop Yield Prediction

Accurate prediction of crop yield enables critical tasks such as identifying the optimum crop profile for planting, assigning government resources and decision-making on imports an...

EFFECT OF BILINGUAL INSTRUCTIONAL METHOD IN THE ACADEMIC ACHIEVEMENT OF JUNIOR SECONDARY SCHOOL STUDENTS IN MATHEMATICS

The importance of mathematics in the modern society is overwhelming. The importance of mathematics has long been recognized all over the world, and that is why all students are req...

Moving towards (new) multilingual paradigms

Abstract Multilingual education is increasingly perceived as a desirable goal in a world where global networks play a significant role. Crucially, educating multilin...

Predict the Risk of Dyslipidemia via Deep Neural Networks for Survival Data

Abstract Background: Dyslipidemia is an important risk factor for coronary artery disease and stroke. Early detection and prevention of dyslipidemia can markedly alter card...

Kra-Dai Languages

Kra-Dai (also called Tai-Kadai and Kam-Tai) is a family of approximately 100 languages spoken in Southeast Asia, extending from the island of Hainan, China, in the east to the Indi...

Reducing Hallucination in Multilingual Voice Agents Using Instruction-Tuned Models

In highly applied multilingual voice agents of customer service and interactive AI systems in the world, one persistent problem constantly haunts the industry/field: hallucinations...

Acoustic cloaking design based on penetration manipulation with combination acoustic metamaterials

The acoustic wave transmission manipulation ability is the most important performance for the acoustic metamaterials. To manipulate the acoustic transmission, the combination acous...

Email:
Password:

Email:

DNN-based Multilingual Acoustic Modeling for Four Ethiopian Languages

Related Results