
Memorization capacity and robustness of neural networks

Machine learning, and deep learning in particular, has recently undergone rapid advancements. To contribute to a rigorous understanding of deep learning, this thesis explores two different research directions. In the first part, we study the interpolation properties of neural networks. Modern practice has demonstrated that neural networks can exhibit remarkable performance when they possess a very large number of parameters. The beginning of this benign over-parameterized regime of model size is marked by the interpolation threshold, where the model is just large enough to interpolate its training data. Estimates based on the memorization capacity of neural networks (which requires the network to be able to interpolate any arrangement of a certain number of samples with arbitrary labeling) can be used to estimate the interpolation threshold, but they turn out to be overly pessimistic. Therefore, we analyze interpolation on a given data set directly and prove upper bounds on the required size of a neural network that interpolates this data. In particular, we analyze fully-connected threshold networks, then extend our analysis to translation-invariant convolutional threshold neural networks, an architecture commonly used in image processing. Our approach is constructive in that we introduce a randomized algorithm that, given the data set as an input, constructs an interpolating neural network with high probability. We link the required number of parameters to a ‘problem-adaptive’ measure of mutual complexity of the data classes, which depends on their geometric properties as well as their mutual arrangement and, in the case of convolutional networks, also takes the translation-invariant structure of the model into account. This results in guarantees that are independent of the number of samples, unlike bounds based on memorization capacity.
In the second part, we study the Lipschitz constants of random neural networks. In practice, neural networks can be susceptible to certain kinds of adversarial attacks, for instance, adversarial examples. These are inputs to which a small perturbation, possibly imperceptible to the human eye, is added in order to alter the output of the network. The Lipschitz constant of a neural network, which measures the influence of an input perturbation on the output of the network over the whole domain, can be used as a worst-case measure of how vulnerable a neural network is to adversarial examples. We analyze, for any $p \in [1, \infty]$, the $\ell_p$-Lipschitz constants of deep random neural networks with ReLU activations. The weights are drawn independently at random according to the popular He initialization, which naturally connects these networks to the random initializations used before training. For the biases, we only make the mild assumption that their entries are independently drawn from a symmetric distribution with a bounded probability density function. For networks that have sufficient width, we derive high-probability upper and lower bounds on the $\ell_p$-Lipschitz constants that differ by at most a factor logarithmic in the width and linear in the depth of the network. In the special case of shallow networks, we are able to derive matching bounds.
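The $\ell_p$-Lipschitz constant of a random ReLU network can be probed numerically. The following is a minimal sketch and not the method of the thesis: it assumes a scalar-output network, Gaussian weights with variance 2/fan_in on every layer (one common reading of He initialization), and standard normal biases as one example of a symmetric distribution with bounded density. All function names are hypothetical. The lower bound uses the fact that, wherever the network is differentiable, the dual-norm ($\ell_q$, with $1/p + 1/q = 1$) of the gradient cannot exceed the Lipschitz constant; the upper bound is the naive product of induced matrix norms, which is typically far from tight.

```python
import numpy as np

def sample_network(widths, rng):
    """Sample (W, b) per layer: W ~ N(0, 2/fan_in), b ~ N(0, 1) (both assumptions)."""
    params = []
    for fan_in, fan_out in zip(widths[:-1], widths[1:]):
        W = rng.normal(0.0, np.sqrt(2.0 / fan_in), size=(fan_out, fan_in))
        b = rng.normal(0.0, 1.0, size=fan_out)
        params.append((W, b))
    return params

def gradient(params, x):
    """Gradient of the scalar network output at x (ReLU on all hidden layers)."""
    jac = np.eye(x.size)
    h = x
    for W, b in params[:-1]:
        pre = W @ h + b
        active = (pre > 0).astype(float)   # ReLU derivative at the pre-activations
        jac = (active[:, None] * W) @ jac  # chain rule: diag(active) @ W @ jac
        h = np.maximum(pre, 0.0)
    W_last, _ = params[-1]                 # final layer is affine, no activation
    return (W_last @ jac).ravel()

def dual_exponent(p):
    """Return q with 1/p + 1/q = 1."""
    if p == 1:
        return np.inf
    if np.isinf(p):
        return 1.0
    return p / (p - 1.0)

def lipschitz_lower_bound(params, p, rng, n_samples=2000):
    """Largest dual-norm gradient over random inputs: a lower bound only."""
    d = params[0][0].shape[1]
    q = dual_exponent(p)
    xs = rng.normal(size=(n_samples, d))
    return max(float(np.linalg.norm(gradient(params, x), ord=q)) for x in xs)

def lipschitz_upper_bound(params, p):
    """Product of induced p-norms of the weight matrices, for p in {1, 2, inf}."""
    if p not in (1, 2, np.inf):
        raise ValueError("induced matrix norms are only computed for p in {1, 2, inf}")
    return float(np.prod([np.linalg.norm(W, ord=p) for W, _ in params]))

rng = np.random.default_rng(0)
params = sample_network([20, 64, 64, 1], rng)
for p in (1, 2, np.inf):
    lo = lipschitz_lower_bound(params, p, rng, n_samples=500)
    hi = lipschitz_upper_bound(params, p)
    print(f"p={p}: sampled lower bound {lo:.3f}, naive upper bound {hi:.3f}")
```

The gap between the two printed values illustrates why sharper bounds, such as those described in the abstract, are of interest; the sketch itself makes no claim about how large that gap is in general.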
Utrecht University Library

Related Results

PENERAPAN METODE TAKRIR TERHADAP KEMAMPUAN HAFALAN AL-QUR’AN SANTRI MA’HAD BAITUL QUR’AN MAN TANJUNGPINANG
In this study, there were 3 issues raised: (1) how to apply the takrir method to the students of Ma'had Baitul Qur'an MAN Tanjungpinang; (2) how is the memorization ability of the ...
The Influence of Quran Understanding on Tahfiz Students Memorization Performance
Understanding the Quran is an important requirement of interaction with the Quran in the process of learning the Quran. It is one of the methods to strengthen the memorization of t...
Tradisi Tahfizh al-Qur`an Lansia di Pondok Tahfizh Ma’had An-Nur Sungai Tanang Kec. Banuhampu Kab. Agam
This study aims to analyze the response of elderly Qur'an memorizers to the Qur'an memorization activities at Ma'had An-Nur Islamic Boarding School in Agam Regency. The research fo...
Fuzzy Chaotic Neural Networks
An understanding of the human brain’s local function has improved in recent years. But the cognition of human brain’s working process as a whole is still obscure. Both fuzzy logic ...
Strategi Guru Tahfidz Dalam Meningkatkan Kemampuan Menghafal Al-Qur’an Pada Peserta Didik di MTs Ihyaul Ulum Dukun Gresik
This research entitled Strategies of Tahfidz Teachers in Improving Students' Ability to Memorize the Qur'an at MTs Ihyaul Ulum Dukun Gresik uses qualitative research. The aim of th...
Impact of Learning Discipline on Students' Qur'an Memorization Achievement
This study examines the impact of learning discipline on the achievement of Qur'anic memorization among students in three Islamic boarding schools. Learning discipline is defined a...
Digital Distraction in Quranic Education: A Mixed Methods Approach
The increasing popularity of social media has raised concerns about its potential impact on cognitive tasks, including Quran memorization (hifz). While previous studies have explor...
