Automatic lipreading using convolutional neural networks and orthogonal moments

Understanding speech from a speaker's mouth using only visual interpretation of lip movements is one of the most complex tasks in computer vision. In this paper, we propose a new approach, Optimized Quaternion Meixner Moments Convolutional Neural Networks (OQMMCNN), for building a lipreading system based only on video images. The approach uses Quaternion Meixner Moments (QMMs) as a filter in the Convolutional Neural Network (CNN) architecture. In addition, we use the Grey Wolf optimization algorithm (GWO) to optimize the local parameters of the QMM filter, with the aim of ensuring high classification accuracy. We show that this method effectively reduces both the high dimensionality of the video images and the training time. The approach is tested on a public dataset and compared with methods from the literature that use complex models and deep architectures.
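
To make the parameter-tuning step more concrete, here is a minimal sketch of the standard Grey Wolf Optimizer update rule, written as a generic box-constrained minimizer. It is not the authors' implementation. The function names `grey_wolf_optimize` and `toy_validation_error`, the parameter names `beta` and `mu`, and their bounds are assumptions made for illustration: the abstract does not name the local parameters, and in the paper's setting the objective would presumably be the classification error of the trained network rather than the placeholder used here.

```python
import numpy as np


def grey_wolf_optimize(objective, lower, upper, n_wolves=20, n_iters=100, seed=0):
    """Minimize `objective` over a box with the standard Grey Wolf update.

    Hypothetical helper for illustration; requires n_wolves >= 3.
    """
    rng = np.random.default_rng(seed)
    lower = np.asarray(lower, dtype=float)
    upper = np.asarray(upper, dtype=float)
    wolves = rng.uniform(lower, upper, size=(n_wolves, lower.size))
    best_x, best_f = wolves[0].copy(), np.inf

    for t in range(n_iters):
        fitness = np.array([objective(w) for w in wolves])
        order = np.argsort(fitness)
        if fitness[order[0]] < best_f:  # keep the best point seen so far
            best_f = float(fitness[order[0]])
            best_x = wolves[order[0]].copy()

        leaders = wolves[order[:3]].copy()  # alpha, beta and delta wolves
        a = 2.0 * (1.0 - t / n_iters)       # decreases linearly from 2 towards 0
        candidates = np.zeros_like(wolves)
        for leader in leaders:
            r1 = rng.random(wolves.shape)
            r2 = rng.random(wolves.shape)
            A = 2.0 * a * r1 - a
            C = 2.0 * r2
            D = np.abs(C * leader - wolves)  # distance to this leader
            candidates += leader - A * D
        # New position: mean of the three leader-guided moves, kept in the box.
        wolves = np.clip(candidates / 3.0, lower, upper)

    # Score the final population as well.
    fitness = np.array([objective(w) for w in wolves])
    i = int(np.argmin(fitness))
    if fitness[i] < best_f:
        best_f, best_x = float(fitness[i]), wolves[i].copy()
    return best_x, best_f


def toy_validation_error(params):
    """Placeholder objective (an assumption), standing in for a real validation error."""
    beta, mu = params
    return ((beta - 10.0) / 10.0) ** 2 + (mu - 0.5) ** 2


if __name__ == "__main__":
    # Assumed search box: beta in [0.1, 50], mu in [0.01, 0.99].
    params, err = grey_wolf_optimize(toy_validation_error, [0.1, 0.01], [50.0, 0.99])
    print("best (beta, mu):", params, "error:", err)
```

Each objective evaluation in a real setting would involve training or at least validating a network, so the population size and iteration count trade search quality against compute cost.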

Lviv Polytechnic National University

Y. Ait Khayi, O. El Ogri, J. El-Mekkaoui, M. Benslimane, A. Hjouji

Mathematical Modeling and Computing

2025

Lipreading is the skill of visually analyzing lip movements and facial cues to understand spoken language. This valuable skill finds application in assisting indiv...

Graph convolutional neural networks for 3D data analysis

Deep Learning allows the extraction of complex features directly from raw input data, eliminating the need for hand-crafted features from the classical Machine Learning p...

Sensory integration of speech by a profoundly deaf subject using tactile aids

Previous research on tactual speech perception has focused on the relative contributions of lipreading and taction with normally hearing subjects. The integration of information fr...

Typical lipreading and audiovisual speech perception without motor simulation

All it takes is a face-to-face conversation in a noisy environment to realize that viewing a speaker’s lip movements contributes to speech comprehension. F...

Fuzzy Chaotic Neural Networks

An understanding of the human brain’s local function has improved in recent years. But the cognition of human brain’s working process as a whole is still obscure. Both fuzzy logic ...

On the role of network dynamics for information processing in artificial and biological neural networks

Understanding how interactions in complex systems give rise to various collective behaviours has been of interest for researchers across a wide range of fields. However, despite ma...

Using local convolutional neural networks for genomic prediction

The prediction of breeding values and phenotypes is of central importance for both livestock and crop breeding. With increasing computational power and mor...

Memorization capacity and robustness of neural networks

Machine learning, and deep learning in particular, has recently undergone rapid advancements. To contribute to a rigorous understanding of deep learning, this thesis explores two d...
