Bidirectional Translation of ASL and English Using Machine Vision and CNN and Transformer Networks

This study presents a real-time, bidirectional system for translating American Sign Language (ASL) to and from English using computer vision and transformer-based models to enhance accessibility for deaf and hard-of-hearing users. Leveraging publicly available sign language and text-to-gloss datasets, the system integrates MediaPipe-based holistic landmark extraction with CNN- and transformer-based architectures to support translation across video, text, and speech modalities within a web-based interface. In the ASL-to-English direction, the sign-to-gloss model achieves a 25.17% word error rate (WER) on the RWTH-PHOENIX-Weather 2014T benchmark, which is competitive with recent continuous sign language recognition systems, and the gloss-level translation attains a ROUGE-L score of 79.89, indicating strong preservation of sign content and ordering. In the reverse English-to-ASL direction, the English-to-gloss transformer trained on ASLG-PC12 achieves a ROUGE-L score of 96.00, demonstrating high-fidelity gloss sequence generation suitable for landmark-based ASL animation. These results highlight a favorable accuracy-efficiency trade-off achieved through compact model architectures and low-latency decoding, supporting practical real-time deployment.
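For readers unfamiliar with the two metrics reported above, the following is a minimal sketch of how a word error rate and a ROUGE-L score can be computed over gloss sequences. It is not the authors' evaluation code: the function names, the whitespace tokenization, the F1 form of ROUGE-L, and the example gloss strings are all assumptions made for illustration. Scores here fall in the range 0 to 1, whereas the abstract reports them on a 0 to 100 scale.

```python
def edit_distance(ref, hyp):
    """Token-level Levenshtein distance (substitutions, insertions, deletions)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, 1):
        curr = [i]
        for j, h in enumerate(hyp, 1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1]


def wer(reference, hypothesis):
    """Word error rate: edit distance divided by the reference length."""
    ref, hyp = reference.split(), hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one token")
    return edit_distance(ref, hyp) / len(ref)


def lcs_length(a, b):
    """Length of the longest common subsequence of two token lists."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, 1):
            curr.append(prev[j - 1] + 1 if x == y else max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def rouge_l_f1(reference, hypothesis):
    """ROUGE-L as the F1 of LCS-based precision and recall (assumed variant)."""
    ref, hyp = reference.split(), hypothesis.split()
    lcs = lcs_length(ref, hyp)
    if lcs == 0:
        return 0.0
    precision = lcs / len(hyp)
    recall = lcs / len(ref)
    return 2 * precision * recall / (precision + recall)


# Hypothetical gloss sequences, invented for this example only.
reference = "TOMORROW RAIN NORTH"
hypothesis = "TOMORROW SNOW NORTH"
print(wer(reference, hypothesis))         # one substitution out of three tokens
print(rouge_l_f1(reference, hypothesis))  # LCS of length 2 on both sides
```

Lower WER is better, while higher ROUGE-L is better. Because ROUGE-L rewards the longest in-order overlap, it is sensitive to gloss ordering as well as content, which matches the abstract's reading of the score.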

Related Results

Aviation English - A global perspective: analysis, teaching, assessment
This e-book brings together 13 chapters written by aviation English researchers and practitioners settled in six different countries, representing institutions and universities fro...
Yes, No, Visibility, and Variation in ASL and Tactile ASL
In American Sign Language (ASL), a receiver watches the signer and receives language visually. In contrast, when using tactile ASL, a variety of ASL, the deaf-blind receiver receiv...
Automatic Load Sharing of Transformer
Transformer plays a major role in the power system. It works 24 hours a day and provides power to the load. The transformer is excessive full, its windings are overheated which lea...
Bidirectional Translation of ASL and English Using Machine Vision and CNN and Transformer Networks
This study aims to develop a system for translating American Sign Language (ASL) to and from English, enhancing accessibility for ASL users. We leveraged a publicly available datas...
The First Insight into the Hereditary Fusion Gene Landscape of Amyotrophic Lateral Sclerosis
Abstract Amyotrophic lateral sclerosis (ALS) is a progressive nervous system disease that causes loss of muscle control. Over 30 mutated genes ar...
High frequency modeling of power transformers under transients
This thesis presents the results related to high frequency modeling of power transformers. First, a 25kVA distribution transformer under lightning surges is tested in the laborator...
Aspects of Rhythm in ASL
The fluent production of American Sign Language (ASL), like speech involves highly skilled, complex motor activity. Thus, like all skilled motor acts, it is rhythmically structured...
570-P: Designing the Deaf Diabetes Can Together Intervention
Deaf individuals who communicate using American Sign Language (ASL) are 3 times more likely to have diabetes than hearing people, yet experience challenges obtaining diabetes educa...
