Shrink and Eliminate: A Study of Post-Training Quantization and Repeated Operations Elimination in RNN Models
Recurrent neural networks (RNNs) are neural networks (NNs) designed for time-series applications. There is growing interest in running RNNs on edge devices to support these applications. However, RNNs have large memory and computational demands that make them challenging to deploy on edge devices. Quantization shrinks the size and computational needs of such models by reducing the precision of weights and activations. Further, the delta networks method increases the sparsity of activation vectors by exploiting the temporal relationship between successive input sequences to eliminate repeated computations and memory accesses. In this paper, we study the effect of quantization on LSTM-, GRU-, LiGRU-, and SRU-based RNN models for speech recognition on the TIMIT dataset. We show how to apply post-training quantization to these models with a minimal increase in error by skipping quantization of selected paths. In addition, we show that quantizing activation vectors in RNNs to integer precision leads to considerable sparsity when the delta networks method is applied. We then propose a method for increasing the sparsity of the activation vectors while minimizing the error and maximizing the percentage of eliminated computations. The proposed quantization method compresses the four models by more than 85%, with error increases of 0.6, 0, 2.1, and 0.2 percentage points, respectively. Applying the delta networks method to the quantized models eliminates more than 50% of the operations, in most cases with only a minor increase in error. Comparing the four models under quantization and the delta networks method, we find that compressed LSTM-based models are the best solutions under low-error-rate constraints, compressed SRU-based models are the smallest in size and suitable when higher error rates are acceptable, and compressed LiGRU-based models eliminate the highest number of operations.
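The abstract does not reproduce the paper's exact quantization recipe, but the basic idea of post-training quantization can be sketched in a few lines. The following Python snippet is an illustrative sketch only; the function names, per-tensor symmetric scheme, and 8-bit width are assumptions, not the paper's method. It maps a float weight matrix to int8 with a single scale factor and no retraining:

    import numpy as np

    def quantize_symmetric(w, num_bits=8):
        # Illustrative sketch, not the paper's exact scheme: per-tensor
        # symmetric post-training quantization maps float weights onto
        # signed integers in [-(2**(b-1) - 1), 2**(b-1) - 1] with one scale.
        qmax = 2 ** (num_bits - 1) - 1            # 127 for int8
        scale = np.max(np.abs(w)) / qmax          # per-tensor scale factor
        w_q = np.clip(np.round(w / scale), -qmax, qmax).astype(np.int8)
        return w_q, scale

    def dequantize(w_q, scale):
        # Recover an approximate float tensor from the integer representation.
        return w_q.astype(np.float32) * scale

    # Quantize a random LSTM-sized weight matrix and check the worst-case error.
    rng = np.random.default_rng(0)
    w = rng.standard_normal((256, 256)).astype(np.float32)
    w_q, s = quantize_symmetric(w)
    print("max abs error:", np.max(np.abs(w - dequantize(w_q, s))))

Skipping quantization of selected paths, as the paper describes, would then amount to leaving some tensors (for example, a particularly error-sensitive recurrent path) in floating point while quantizing the rest.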
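The delta networks idea can be sketched the same way: at each timestep, only the activation components that have changed by more than a threshold since they were last propagated are pushed through the weight matrix, so the matrix columns for unchanged components are skipped. A minimal NumPy illustration follows; the threshold value, names, and bookkeeping are assumptions, not the paper's implementation:

    import numpy as np

    def delta_rnn_matvec(W, x_seq, theta=0.1):
        # Delta-networks-style matrix-vector products over a sequence:
        # rather than computing W @ x_t from scratch each timestep, only
        # components of x that changed by at least theta are propagated,
        # so columns of W for unchanged components are skipped.
        n_out, n_in = W.shape
        x_ref = np.zeros(n_in, dtype=W.dtype)   # last values actually propagated
        y = np.zeros(n_out, dtype=W.dtype)      # running partial sum (memory)
        skipped = total = 0
        outputs = []
        for x in x_seq:
            delta = x - x_ref
            active = np.abs(delta) >= theta     # components worth updating
            x_ref[active] = x[active]           # update reference only where used
            y = y + W[:, active] @ delta[active]
            skipped += np.count_nonzero(~active)
            total += n_in
            outputs.append(y.copy())
        return outputs, skipped / total         # outputs + fraction of ops skipped

    # Example on a slowly varying (temporally correlated) input sequence.
    rng = np.random.default_rng(1)
    W = rng.standard_normal((128, 64)).astype(np.float32)
    xs = np.cumsum(0.05 * rng.standard_normal((50, 64)).astype(np.float32), axis=0)
    ys, skip_frac = delta_rnn_matvec(W, xs, theta=0.1)
    print(f"eliminated {skip_frac:.0%} of multiply-accumulates")

When activations are quantized to integer precision, small changes round to exactly zero, which is why quantization alone already produces considerable delta sparsity, as noted in the abstract.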
Related Results
Energy-efficient architectures for recurrent neural networks
Deep Learning algorithms have been remarkably successful in applications such as Automatic Speech Recognition and Machine Translation. Thus, these kinds of applications are ubiquit...
Development of a Recurrent Neural Network Model for Prediction of Dengue Importation
Objective: We aim to develop a prediction model for the number of imported cases of infectious disease by using the recurrent neural network (RNN) with the Elman algorithm [1], a type o...
Pelatihan Peramalan Target Indikator Kinerja Daerah (Training on Forecasting Regional Performance Indicator Targets)
This training-based community service aims to improve the ability of functional planner staff to forecast regional performance indicator targets. This t...
Comparative Evaluation of Deep Learning Techniques in Streamflow Monthly Prediction of the Zarrine River Basin
Predicting monthly streamflow is essential for hydrological analysis and water resource management. Recent advancements in deep learning, particularly long short-term memory (LSTM)...
Progress of shrink polymer micro- and nanomanufacturing
Traditional lithography plays a significant role in the fabrication of micro- and nanostructures. Nevertheless, the fabrication process still suffers from the limitations o...
RNN-LSTM BASED REGULAR HEALTH FACTOR ANALYSIS IN MEDICAL ENVIRONMENT
In an era where fast-paced routines, high stress, and unhealthy habits have become the norm, modern society is facing a surge in health problems such as high blood pressure, diabet...
Predicting lymphatic filariasis elimination in data-limited settings: a reconstructive computational framework for combining data generation and model discovery
Although there is increasing recognition of the importance of mathematical models in the effective design and management of long-term parasite elimination, it is also becom...

