Bidirectional Translation of ASL and English Using Machine Vision and CNN and Transformer Networks

This study presents a real-time, bidirectional system for translating American Sign Language (ASL) to and from English using computer vision and transformer-based models to enhance accessibility for deaf and hard-of-hearing users. Leveraging publicly available sign language and text-to-gloss datasets, the system integrates MediaPipe-based holistic landmark extraction with CNN- and transformer-based architectures to support translation across video, text, and speech modalities within a web-based interface. In the ASL-to-English direction, the sign-to-gloss model achieves a 25.17% word error rate (WER) on the RWTH-PHOENIX-Weather 2014T benchmark, which is competitive with recent continuous sign language recognition systems, and the gloss-level translation attains a ROUGE-L score of 79.89, indicating strong preservation of sign content and ordering. In the reverse English-to-ASL direction, the English-to-gloss transformer trained on ASLG-PC12 achieves a ROUGE-L score of 96.00, demonstrating high-fidelity gloss sequence generation suitable for landmark-based ASL animation. These results highlight a favorable accuracy-efficiency trade-off achieved through compact model architectures and low-latency decoding, supporting practical real-time deployment.
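
For readers unfamiliar with the two metrics reported above, the following is a minimal sketch of how WER and a ROUGE-L F-measure can be computed over gloss token sequences. It is not the authors' evaluation code: the function names and the example glosses are hypothetical, and the exact ROUGE-L variant and tokenization used in the study are not stated in the abstract, so the plain LCS-based F1 here is an assumption.

```python
from typing import List


def edit_distance(ref: List[str], hyp: List[str]) -> int:
    """Levenshtein distance between two token sequences."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion of a reference token
                            curr[j - 1] + 1,      # insertion of a hypothesis token
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1]


def wer(ref: List[str], hyp: List[str]) -> float:
    """Word error rate: edit distance divided by reference length."""
    if not ref:
        raise ValueError("reference must contain at least one token")
    return edit_distance(ref, hyp) / len(ref)


def lcs_length(a: List[str], b: List[str]) -> int:
    """Length of the longest common subsequence of two token sequences."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            curr.append(prev[j - 1] + 1 if x == y else max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def rouge_l_f1(ref: List[str], hyp: List[str]) -> float:
    """ROUGE-L as the F1 of LCS-based precision and recall (assumed variant)."""
    lcs = lcs_length(ref, hyp)
    if lcs == 0:
        return 0.0
    precision = lcs / len(hyp)
    recall = lcs / len(ref)
    return 2 * precision * recall / (precision + recall)


# Hypothetical gloss sequences, for illustration only.
reference = "TOMORROW RAIN NORTH WIND STRONG".split()
hypothesis = "TOMORROW RAIN WIND STRONG".split()

print(f"WER     = {wer(reference, hypothesis):.2%}")
print(f"ROUGE-L = {100 * rouge_l_f1(reference, hypothesis):.2f}")
```

Lower is better for WER, since it counts errors, while higher is better for ROUGE-L, since it rewards shared in-order tokens. This is why a 25.17% WER and a ROUGE-L of 79.89 can both describe the same ASL-to-English pipeline.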

MDPI AG

Stefanie Amiruzzaman Md Amiruzzaman Raga Mouni Batchu James Dracup Alexander Pham Benjamin Crocker Linh Ngo M. Ali Akber Dewan

Computers

2026
