
DARB: A Density-Adaptive Regular-Block Pruning for Deep Neural Networks

The rapidly growing parameter volume of deep neural networks (DNNs) hinders artificial intelligence applications on resource-constrained devices, such as mobile and wearable devices. Neural network pruning, one of the mainstream model compression techniques, is under extensive study as a way to reduce model size and thus the amount of computation, so that state-of-the-art DNNs can be deployed on those devices with high runtime energy efficiency. In contrast to irregular pruning, which incurs high index storage and decoding overhead, structured pruning techniques have been proposed as a promising solution. However, prior studies on structured pruning tackle the problem mainly from the perspective of facilitating hardware implementation, without analyzing the characteristics of sparse neural networks in depth. This neglect leads to an inefficient trade-off between regularity and pruning ratio. Consequently, the potential of structurally pruning neural networks has not been fully exploited.
In this work, we examine the structural characteristics of irregularly pruned weight matrices, such as the diverse redundancy of different rows, the sensitivity of different rows to pruning, and the positional characteristics of retained weights. Guided by these insights, we first propose the novel block-max weight masking (BMWM) method, which effectively retains the salient weights while imposing high regularity on the weight matrix. As a further optimization, we propose density-adaptive regular-block (DARB) pruning, which takes advantage of the intrinsic characteristics of neural networks and thereby outperforms prior structured pruning work in both pruning ratio and decoding efficiency.
Our experimental results show that DARB achieves a 13× to 25× pruning ratio, a 2.8× to 4.3× improvement over state-of-the-art counterparts on multiple neural network models and tasks. Moreover, DARB achieves 14.3× the decoding efficiency of block pruning while also reaching a higher pruning ratio.
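The abstract does not spell out the masking rule, so the following is only a minimal sketch under stated assumptions. It assumes, from the method's name, that block-max weight masking splits each row into equal-sized blocks and keeps only the largest-magnitude weight in each block, and that "density-adaptive" means the block size may differ from row to row. The function names (block_max_mask, build_masks, apply_masks, pruning_ratio) and the example block sizes are hypothetical and are not taken from the paper.

```python
from typing import List


def block_max_mask(row: List[float], block_size: int) -> List[int]:
    """Return a 0/1 mask that keeps one weight per block of the row.

    ASSUMPTION: the kept weight is the one with the largest magnitude
    in its block; ties go to the earliest position.
    """
    if block_size < 1:
        raise ValueError("block_size must be >= 1")
    mask = [0] * len(row)
    for start in range(0, len(row), block_size):
        block = range(start, min(start + block_size, len(row)))
        keep = max(block, key=lambda i: abs(row[i]))
        mask[keep] = 1
    return mask


def build_masks(weights: List[List[float]], block_sizes: List[int]) -> List[List[int]]:
    """Mask each row with its own block size.

    ASSUMPTION: a larger block size models a sparser (more redundant) row.
    """
    if len(weights) != len(block_sizes):
        raise ValueError("need exactly one block size per row")
    return [block_max_mask(row, size) for row, size in zip(weights, block_sizes)]


def apply_masks(weights: List[List[float]], masks: List[List[int]]) -> List[List[float]]:
    """Zero out every weight whose mask entry is 0."""
    return [[w if m else 0.0 for w, m in zip(row, mask)]
            for row, mask in zip(weights, masks)]


def pruning_ratio(masks: List[List[int]]) -> float:
    """Total number of weights divided by the number of retained weights."""
    total = sum(len(mask) for mask in masks)
    kept = sum(sum(mask) for mask in masks)
    return total / kept


# Toy example: row 0 is treated as highly redundant (one block of 4),
# row 1 as less redundant (two blocks of 2).
weights = [[0.1, -0.9, 0.3, 0.2],
           [0.5, 0.05, -0.6, 0.7]]
block_sizes = [4, 2]

masks = build_masks(weights, block_sizes)
pruned = apply_masks(weights, masks)
ratio = pruning_ratio(masks)
```

In this toy case 3 of the 8 weights survive, so the pruning ratio is 8/3. By the same definition, the 13× to 25× ratios reported above correspond to retaining roughly 7.7% down to 4% of the weights. Because exactly one weight survives per block, a retained weight's position can be stored as a small offset within its block, which is one plausible reading of why a regular-block layout would be cheap to decode.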

Related Results

Efficient Layer Optimizations for Deep Neural Networks
Deep neural networks (DNNs) have technical issues such as long training time as the network size increases. Parameters require significant memory, which may cause migration issues ...
A research on rejuvenation pruning of lavandin (Lavandula x intermedia Emeric ex Loisel.)
Objective: The main purpose of the research was to investigate whether to be renewed or not without the need for re-planting by rejuvenation pruning to the aged plantations of lavandi...
Advancing Transformer Efficiency with Token Pruning
Transformer-based models have revolutionized natural language processing (NLP), achieving state-of-the-art performance across a wide range of tasks. However, their high computation...
Fuzzy Chaotic Neural Networks
An understanding of the human brain’s local function has improved in recent years. But the cognition of human brain’s working process as a whole is still obscure. Both fuzzy logic ...
On the role of network dynamics for information processing in artificial and biological neural networks
Understanding how interactions in complex systems give rise to various collective behaviours has been of interest for researchers across a wide range of fields. However, despite ma...
Effect of pruning on the growth and yield of cucumber (Cucumis sativus L.) Mercy Varieties
This study aims to determine the effect of pruning on the growth and yield of cucumber variety Mercy. This research was organized using a Randomized Group Design (RAK) with treatme...
Research and application of high-power pruning robot based on RTK positioning and heavy load mountings
Aiming at the problem that short circuit tripping may be caused by the insufficient safe distance between trees and lower phase conductors in high voltage transmission line corrido...
Prospective, Randomized Comparison of Deep or Superficial Cervical Plexus Block for Carotid Endarterectomy Surgery 
Background: Carotid endarterectomy may be performed under cervical plexus block with local anesthetic supplementation by the surgeon as necessary during surgery. It is u...
