Leveraging Diversity for Privileged Multi-Teacher Knowledge Distillation for Facial Expression Recognition
Learning with privileged information allows a model to exploit data from additional modalities that are available only during training. State-of-the-art methods for privileged knowledge distillation (PKD) distill information from a teacher model (which combines prevalent and privileged modalities) to a student model (which has no access to the privileged modalities). Recent methods capture and distill structural information and outperform point-to-point PKD methods. However, such PKD methods are primarily confined to learning from a single joint teacher representation, which limits their robustness, accuracy, and ability to learn from diverse sources. Diversity in the feature space leads to higher performance in such a distillation scheme. This paper proposes multi-teacher privileged knowledge distillation (MT-PKDOT), a novel method for diversifying the teacher space by recycling the existing backbone feature representations and aligning them with the multimodal space using lightweight adaptation. A simple yet effective teacher selection mechanism implicitly mitigates negative transfer and allows distillation from the most accurate teacher at each distillation step. MT-PKDOT employs a structural similarity KD mechanism based on entropy-regularized optimal transport. An additional constraint is added to the loss function to explicitly align the centroids in the student space. The proposed MT-PKDOT method was validated on the Affwild2 and Biovid databases. Results indicate that the proposed method improves over existing state-of-the-art privileged knowledge distillation methods and, in cases where the alignment is ineffective, maintains performance similar to single-teacher methods. It improves the visual-only baseline on the Biovid dataset by 5.8%.
On the Affwild2 dataset, the proposed method improves by 4% and 5% over the visual-only lower bound for valence and arousal, respectively. The code is publicly available at: https://github.com/haseebaslam95/MT-PKDOT.
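The abstract names two mechanisms without giving formulas: picking the most accurate teacher at each step, and a structural similarity loss based on entropy-regularized optimal transport. The following is a minimal sketch of how these ideas could look, not the authors' implementation (see the linked repository for that). Everything in it is an assumption made for illustration: the function names, the use of per-teacher error values for selection, the cost defined as squared distances between rows of batch cosine-similarity matrices, the uniform marginals, and the regularization value. The Sinkhorn iteration itself is the standard solver for entropy-regularized optimal transport.

```python
import math


def select_teacher(teacher_errors):
    """Return the index of the teacher with the lowest error on the
    current batch (hypothetical selection rule: lowest error wins)."""
    return min(range(len(teacher_errors)), key=teacher_errors.__getitem__)


def cosine_similarity_matrix(feats):
    """Pairwise cosine similarities of a batch of non-zero feature vectors."""
    norms = [math.sqrt(sum(x * x for x in f)) for f in feats]
    return [[sum(x * y for x, y in zip(fi, fj)) / (ni * nj)
             for fj, nj in zip(feats, norms)]
            for fi, ni in zip(feats, norms)]


def sinkhorn(cost, reg=0.5, n_iters=200):
    """Entropy-regularized optimal transport between two uniform
    distributions. `cost` is an n x m list of lists; the result is the
    n x m transport plan. No log-domain stabilization is applied, so
    very large cost/reg ratios can underflow."""
    n, m = len(cost), len(cost[0])
    a = [1.0 / n] * n  # uniform source marginal (assumption)
    b = [1.0 / m] * m  # uniform target marginal (assumption)
    kernel = [[math.exp(-c / reg) for c in row] for row in cost]
    u = [1.0] * n
    v = [1.0] * m
    for _ in range(n_iters):
        # Alternately rescale rows and columns to match the marginals.
        u = [a[i] / sum(kernel[i][j] * v[j] for j in range(m))
             for i in range(n)]
        v = [b[j] / sum(kernel[i][j] * u[i] for i in range(n))
             for j in range(m)]
    return [[u[i] * kernel[i][j] * v[j] for j in range(m)] for i in range(n)]


def ot_structure_loss(teacher_feats, student_feats, reg=0.5):
    """Transport cost between the teacher's and the student's batch
    similarity structure (one possible reading of 'structural similarity')."""
    sim_t = cosine_similarity_matrix(teacher_feats)
    sim_s = cosine_similarity_matrix(student_feats)
    # Cost: squared Euclidean distance between similarity-matrix rows.
    cost = [[sum((p - q) ** 2 for p, q in zip(row_t, row_s))
             for row_s in sim_s]
            for row_t in sim_t]
    plan = sinkhorn(cost, reg)
    return sum(plan[i][j] * cost[i][j]
               for i in range(len(cost)) for j in range(len(cost[0])))


# Usage: two hypothetical teachers, distill from the more accurate one.
teachers = [
    [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]],
    [[1.0, 0.2], [0.1, 1.0], [0.9, 1.1]],
]
student = [[2.0, 0.0], [0.0, 3.0], [1.0, 1.0]]
best = select_teacher([0.40, 0.10])  # made-up per-teacher batch errors
loss = ot_structure_loss(teachers[best], student)
```

Because the teacher and student similarity matrices are both batch-by-batch, the two feature spaces may have different dimensions, which is what makes a structural loss convenient when the student sees fewer modalities than the teacher.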

