Multistructure-Based Collaborative Online Distillation
Deep learning has recently achieved state-of-the-art performance in many areas, surpassing traditional machine-learning methods built on shallow architectures. However, achieving higher accuracy usually requires extending network depth or ensembling the outputs of several neural networks, both of which increase the demand for memory and computing resources. This makes it difficult to deploy deep-learning models in resource-constrained scenarios such as drones, mobile phones, and autonomous driving. Improving network performance without expanding the network scale has therefore become an active research topic. In this paper, we propose a cross-architecture online-distillation approach that addresses this problem by transferring complementary information among networks of different structures. We aggregate these differently structured networks with an ensemble method, forming a stronger teacher than traditional distillation provides. In addition, discontinuous distillation with progressively strengthened constraints replaces fixed distillation, reducing the loss of information diversity during the distillation process. The resulting training method improves the distillation effect and yields strong gains in network performance. We validated the approach on several popular models: on the CIFAR100 dataset, accuracy improved by 5.94% for AlexNet, 2.88% for VGG, 5.07% for ResNet, and 1.28% for DenseNet. Extensive experiments on the CIFAR10, CIFAR100, and ImageNet datasets demonstrate significant improvements over traditional knowledge distillation.
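To make the training scheme more concrete, below is a minimal sketch of this style of collaborative online distillation, assuming PyTorch. Each network learns from the hard labels plus a KL term toward the averaged logits of the other architectures (the ensemble teacher); the distillation weight ramps up linearly (a stand-in for the progressively strengthened constraint), and distillation is applied only on some epochs (a stand-in for the discontinuous schedule). The hyperparameters (T, distill_every, ramp_epochs) and the exact scheduling rule are illustrative assumptions, not the paper's settings.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def ensemble_teacher(logits_list, exclude):
    """Average the peers' logits, excluding the current student."""
    peers = [z for i, z in enumerate(logits_list) if i != exclude]
    return torch.stack(peers, dim=0).mean(dim=0)

def collaborative_step(models, optimizers, x, y, epoch,
                       T=3.0, distill_every=2, max_alpha=1.0, ramp_epochs=50):
    """One training step: every network learns from the labels and, on
    distillation epochs, from the ensemble of the other architectures."""
    logits = [m(x) for m in models]
    # Progressively strengthened constraint: the distillation weight
    # grows linearly over training (illustrative schedule).
    alpha = max_alpha * min(1.0, epoch / ramp_epochs)
    # Discontinuous distillation: only every distill_every-th epoch,
    # which helps preserve diversity among the peers.
    do_distill = (epoch % distill_every == 0) and len(models) > 1
    losses = []
    for i, (opt, z) in enumerate(zip(optimizers, logits)):
        loss = F.cross_entropy(z, y)
        if do_distill:
            teacher = ensemble_teacher([t.detach() for t in logits], i)
            kld = F.kl_div(F.log_softmax(z / T, dim=1),
                           F.softmax(teacher / T, dim=1),
                           reduction="batchmean") * (T * T)
            loss = loss + alpha * kld
        opt.zero_grad()
        loss.backward()
        opt.step()
        losses.append(loss.item())
    return losses

# Illustrative usage with two small, structurally different networks.
models = [nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 100)),
          nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 256),
                        nn.ReLU(), nn.Linear(256, 100))]
optimizers = [torch.optim.SGD(m.parameters(), lr=0.1) for m in models]
x, y = torch.randn(8, 3, 32, 32), torch.randint(0, 100, (8,))
collaborative_step(models, optimizers, x, y, epoch=2)
```

Detaching the teacher logits keeps each architecture's gradients independent of its peers, and the T*T factor is the standard Hinton-style scaling that keeps the soft-target gradients comparable in magnitude to the hard-label loss.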
Related Results
A Comprehensive Review of Distillation in the Pharmaceutical Industry
Distillation processes play a pivotal role in the pharmaceutical industry for the purification of active pharmaceutical ingredients (APIs), intermediates, and solvent recovery. Thi...
How Should College Physical Education (CPE) Conduct Collaborative Governance? A Survey Based on Chinese Colleges
Background and Aim: College physical education (CPE) is a Key Stage in the transition from school physical education to national sports. Collaborative governance is an effective ne...
Steam Distillation Studies For The Kern River Field
The interactions of heavy oil and injected steam in the mature steamflood at the Kern River Field have been extensively studied to gain insight into the ...
Initial Experience with Pediatrics Online Learning for Nonclinical Medical Students During the COVID-19 Pandemic
Background: To minimize the risk of infection during the COVID-19 pandemic, the learning mode of universities in China has been adjusted, and the online learning o...
Combined Knowledge Distillation Framework: Breaking Down Knowledge Barriers
Knowledge distillation, one of the most prominent methods in model compression, has successfully balanced small model sizes and high performance. However, it has been obse...
Research Status and Development Trend of Multi-arm Collaborative Robots
Industrial robots are mainly used in metal forming, automotive, and electrical and electronics industries. After decades of unremitting efforts, industrial robots have achieved gre...
Design of integrated real time optimization and model predictive control for distillation column
To present the design of the integrated real time optimization (RTO) and model predictive control (MPC) with application to distillation column. The integration of the RTO and MPC ...
Knowledge distillation in deep learning and its applications
Deep learning based models are relatively large, and it is hard to deploy such models on resource-limited devices such as mobile phones and embedded devices. One possible solution ...

