Iterative convergent computation may not be a useful inductive bias for residual neural networks
Abstract
Recent work has suggested that feedforward residual neural networks (ResNets) approximate iterative recurrent computations. Iterative computations are useful in many domains, so they might provide good solutions for neural networks to learn. Here we quantify the degree to which ResNets learn iterative solutions and introduce a regularization approach that encourages learning of iterative solutions. Iterative methods are characterized by two properties: iteration and convergence. To quantify these properties, we define three indices of iterative convergence. Consistent with previous work, we show that, even though ResNets can express iterative solutions, they do not learn them when trained conventionally on computer vision tasks. We then introduce regularizations to encourage iterative convergent computation and test whether this provides a useful inductive bias. To make the networks more iterative, we manipulate the degree of weight sharing across layers using soft gradient coupling. This new method provides a form of recurrence regularization and can interpolate smoothly between an ordinary ResNet and a “recurrent” ResNet (i.e., one that uses identical weights across layers and thus could be physically implemented with a recurrent network computing the successive stages iteratively across time). To make the networks more convergent, we impose a Lipschitz constraint on the residual functions using spectral normalization. The three indices of iterative convergence reveal that the gradient coupling and the Lipschitz constraint succeed at making the networks iterative and convergent, respectively. However, neither recurrence regularization nor spectral normalization improves classification accuracy on standard visual recognition tasks (MNIST, CIFAR-10, CIFAR-100) or on challenging recognition tasks with partial occlusions (Digitclutter). Iterative convergent computation, in these tasks, does not provide a useful inductive bias for ResNets.
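A minimal sketch of the two ingredients described above, under stated assumptions: spectral normalization here uses the standard PyTorch utility torch.nn.utils.spectral_norm to bound the Lipschitz constant of each residual function, while couple_gradients illustrates one plausible form of soft gradient coupling, blending every block's gradient with the mean gradient across blocks by a coefficient alpha. The names ResidualBlock, couple_gradients, and alpha, and the exact coupling rule, are illustrative assumptions, not the authors' implementation.

```python
import torch
import torch.nn as nn
from torch.nn.utils import spectral_norm


class ResidualBlock(nn.Module):
    """Hypothetical residual block; spectral_norm constrains the weight
    matrices, encouraging the residual function to be convergent."""

    def __init__(self, channels):
        super().__init__()
        self.conv1 = spectral_norm(nn.Conv2d(channels, channels, 3, padding=1))
        self.conv2 = spectral_norm(nn.Conv2d(channels, channels, 3, padding=1))
        self.relu = nn.ReLU()

    def forward(self, x):
        return x + self.conv2(self.relu(self.conv1(x)))


def couple_gradients(blocks, alpha):
    """Soft gradient coupling (sketch): blend each block's gradient with the
    mean gradient over corresponding parameters of all blocks.
    alpha = 0 leaves an ordinary ResNet; alpha = 1, together with identical
    initialization of the blocks, keeps their weights tied, emulating a
    weight-shared ("recurrent") ResNet."""
    for params in zip(*(b.parameters() for b in blocks)):
        grads = [p.grad for p in params if p.grad is not None]
        if len(grads) != len(params):
            continue
        mean_grad = torch.stack(grads).mean(dim=0)
        for p in params:
            p.grad = (1 - alpha) * p.grad + alpha * mean_grad
```

In a training loop, couple_gradients(blocks, alpha) would be called after loss.backward() and before optimizer.step(); intermediate values of alpha interpolate between the unshared and fully shared regimes.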