Bidirectional Translation of ASL and English Using Machine Vision and CNN and Transformer Networks
This study aims to develop a system that translates between American Sign Language (ASL) and English in both directions, improving accessibility for ASL users. We used a publicly available dataset to train a model that accurately predicts ASL signs and their English translations. The system employs transformer models for bidirectional translation: converting text and speech into ASL with the help of computer vision, and translating ASL signs into text. For user accessibility, we built a web-based interface that integrates a computer vision framework (MediaPipe) to detect key body landmarks, including hands, shoulders, and facial features. This lets the system process text, speech input, and video recordings, which are stored using msgpack and analyzed to generate ASL imagery. In addition, we are developing a transformer model trained jointly on pairs of gloss sequences and sentences, using connectionist temporal classification (CTC) loss and cross-entropy loss. We also use an EfficientNet-B0 pretrained on ImageNet, combined with 1D convolution blocks, to extract features from video frames, which supports the conversion of ASL signs into structured English text.
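The joint objective described here (a CTC loss on gloss sequences plus a cross-entropy loss on sentences) can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the blank index, the equal default loss weights, the toy inputs, and every function name are hypothetical, since the abstract does not specify them. A real system would more likely rely on a deep learning framework's built-in losses.

```python
import math

BLANK = 0  # assumed index of the CTC blank symbol (not given in the abstract)


def log_sum_exp(values):
    """Numerically stable log(sum(exp(v))) over a list of log-values."""
    m = max(values)
    if m == -math.inf:
        return -math.inf
    return m + math.log(sum(math.exp(v - m) for v in values))


def ctc_nll(log_probs, target):
    """CTC negative log-likelihood via the forward algorithm.

    log_probs: T x V list of per-frame log-probabilities.
    target: list of gloss ids, without blanks.
    """
    # Interleave blanks: [g1, g2] -> [BLANK, g1, BLANK, g2, BLANK]
    ext = [BLANK]
    for g in target:
        ext += [g, BLANK]
    size = len(ext)
    alpha = [-math.inf] * size
    alpha[0] = log_probs[0][BLANK]
    if size > 1:
        alpha[1] = log_probs[0][ext[1]]
    for t in range(1, len(log_probs)):
        new = [-math.inf] * size
        for s in range(size):
            terms = [alpha[s]]
            if s >= 1:
                terms.append(alpha[s - 1])
            # Skipping a blank is allowed only between two different labels.
            if s >= 2 and ext[s] != BLANK and ext[s] != ext[s - 2]:
                terms.append(alpha[s - 2])
            new[s] = log_sum_exp(terms) + log_probs[t][ext[s]]
        alpha = new
    tail = [alpha[-1]] + ([alpha[-2]] if size > 1 else [])
    return -log_sum_exp(tail)


def cross_entropy(log_probs, target):
    """Mean token-level cross-entropy for a predicted sentence."""
    return -sum(lp[y] for lp, y in zip(log_probs, target)) / len(target)


def joint_loss(gloss_lp, glosses, word_lp, words, ctc_weight=1.0, ce_weight=1.0):
    """Weighted sum of both losses; the weights are illustrative assumptions."""
    return ctc_weight * ctc_nll(gloss_lp, glosses) + ce_weight * cross_entropy(word_lp, words)


# Toy example: two frames, vocabulary {blank, gloss 1}, uniform predictions.
# Three alignments (1,1), (1,blank), (blank,1) collapse to [1], so P = 0.75.
uniform = [math.log(0.5), math.log(0.5)]
toy_ctc = ctc_nll([uniform, uniform], [1])
toy_joint = joint_loss([uniform, uniform], [1], [uniform], [1])
```

Summing the two terms lets a single backward pass train both the gloss recognizer and the sentence decoder; how the two terms are actually weighted in the described system is not stated in the source.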

