Javascript must be enabled to continue!
Automatic lipreading using convolutional neural networks and orthogonal moments
View through CrossRef
Recently, understanding speech from a speaker's mouth using only visual interpretation of the lips movement has become one of the most complex computer vision tasks. In the present paper, we suggest a new approach named Optimized Quaternion Meixner Moments Convolutional Neural Networks (OQMMCNN) in order to develop a lipreading system based only on video images. This approach is based on Quaternion Meixner Moments (QMMs) that we use as a filter in the Convolutional Neural Networks (CNN) architecture. In addition, we use the Grey Wolf optimization algorithm (GWO) with the aim of ensuring high accuracy of classification through the optimization of the Quaternion Meixner Moments (QMMs) filter local parameters. We show that this method is an effective solution to decrease the high dimensionality of the video images and the training time. This approach is tested on a public dataset and compared to different methods that use complex models and deep architecture in the literature.
Lviv Polytechnic National University
Title: Automatic lipreading using convolutional neural networks and orthogonal moments
Description:
Recently, understanding speech from a speaker's mouth using only visual interpretation of the lips movement has become one of the most complex computer vision tasks.
In the present paper, we suggest a new approach named Optimized Quaternion Meixner Moments Convolutional Neural Networks (OQMMCNN) in order to develop a lipreading system based only on video images.
This approach is based on Quaternion Meixner Moments (QMMs) that we use as a filter in the Convolutional Neural Networks (CNN) architecture.
In addition, we use the Grey Wolf optimization algorithm (GWO) with the aim of ensuring high accuracy of classification through the optimization of the Quaternion Meixner Moments (QMMs) filter local parameters.
We show that this method is an effective solution to decrease the high dimensionality of the video images and the training time.
This approach is tested on a public dataset and compared to different methods that use complex models and deep architecture in the literature.
Related Results
Lip2Text: Sentence-Level Lipreading on English Speakers Using the Deep Learning Approach
Lip2Text: Sentence-Level Lipreading on English Speakers Using the Deep Learning Approach
Abstract
Lipreading is the skill of visually analyzing lip movements and facial cues to understand spoken language. This valuable skill finds application in assisting indiv...
Graph convolutional neural networks for 3D data analysis
Graph convolutional neural networks for 3D data analysis
(English) Deep Learning allows the extraction of complex features directly from raw input data, eliminating the need for hand-crafted features from the classical Machine Learning p...
Sensory integration of speech by a profoundly deaf subject using tactile aids
Sensory integration of speech by a profoundly deaf subject using tactile aids
Previous research on tactual speech perception has focused on the relative contributions of lipreading and taction with normally hearing subjects. The integration of information fr...
Typical lipreading and audiovisual speech perception without motor simulation
Typical lipreading and audiovisual speech perception without motor simulation
ABSTRACT
All it takes is a face to face conversation in a noisy environment to realize that viewing a speaker’s lip movements contributes to speech comprehension. F...
Fuzzy Chaotic Neural Networks
Fuzzy Chaotic Neural Networks
An understanding of the human brain’s local function has improved in recent years. But the cognition of human brain’s working process as a whole is still obscure. Both fuzzy logic ...
On the role of network dynamics for information processing in artificial and biological neural networks
On the role of network dynamics for information processing in artificial and biological neural networks
Understanding how interactions in complex systems give rise to various collective behaviours has been of interest for researchers across a wide range of fields. However, despite ma...
Using local convolutional neural networks for genomic prediction
Using local convolutional neural networks for genomic prediction
ABSTRACT
The prediction of breeding values and phenotypes is of central importance for both livestock and crop breeding. With increasing computational power and mor...
Memorization capacity and robustness of neural networks
Memorization capacity and robustness of neural networks
Machine learning, and deep learning in particular, has recently undergone rapid advancements. To contribute to a rigorous understanding of deep learning, this thesis explores two d...

