Neural network robustness refers to the ability of a neural network to maintain its performance and generalization capability in the presence of perturbations or adversarial inputs. A robust network is resilient to changes, uncertainties, or intentional distortions in the input data, so its predictions remain accurate and reliable. Achieving robustness is an ongoing research challenge, especially as models become more complex and are deployed in safety-critical applications. Researchers and practitioners aim to develop techniques that make neural network predictions more reliable and trustworthy in real-world scenarios.
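
One common way to make this notion precise is pointwise robustness under a norm-bounded perturbation. The statement below is a standard textbook-style formalization rather than something taken from the text above; the symbols f (the classifier), x (an input), δ (the perturbation), ε (the perturbation budget), and p (the choice of norm) are introduced here purely for illustration.

```latex
% A classifier f is \epsilon-robust at input x (with respect to the \ell_p norm)
% if no perturbation of size at most \epsilon changes its prediction:
\[
  f(x + \delta) = f(x)
  \quad \text{for all } \delta \text{ with } \|\delta\|_p \le \epsilon .
\]
% An adversarial example at x is any x + \delta with \|\delta\|_p \le \epsilon
% for which f(x + \delta) \ne f(x).
```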


Improving neural network robustness through neighborhood preserving layers

One major source of vulnerability of neural nets in classification tasks comes from overparameterized fully connected layers near the end of the network. In this paper, we propose a new neighborhood preserving layer which can replace these fully connected layers to improve the network robustness. Networks including these neighborhood preserving layers can be trained efficiently. We theoretically prove that our proposed layers are more robust against distortion because they effectively control the magnitude of gradients. Finally, we empirically show that networks with our proposed layers are more robust against state-of-the-art gradient-based attacks, such as projected gradient descent (PGD), on the benchmark image classification datasets MNIST and CIFAR10.
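
To make the link between gradient magnitude and attack strength concrete, here is a minimal sketch of an l-infinity PGD attack. It is not the paper's code or experimental setup: it assumes a toy linear softmax classifier in place of a real network, and the names `loss_and_grad` and `pgd_attack` and the parameters `eps`, `step`, and `n_steps` are hypothetical. Each step moves the input along the sign of the loss gradient and then projects it back into the eps-ball, so a model whose input gradients are small gives the attacker less room to raise the loss within that ball.

```python
import numpy as np


def softmax(z):
    """Numerically stable softmax over the last axis."""
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)


def loss_and_grad(W, b, x, y):
    """Cross-entropy loss of a linear softmax classifier and its gradient w.r.t. x.

    Assumption: the linear model (logits = x @ W + b) is a toy stand-in for a network.
    """
    p = softmax(x @ W + b)
    loss = -np.log(p[y])
    onehot = np.zeros_like(p)
    onehot[y] = 1.0
    grad_x = W @ (p - onehot)  # d(loss)/dx for logits = x @ W + b
    return loss, grad_x


def pgd_attack(W, b, x, y, eps=0.1, step=0.02, n_steps=20):
    """l-infinity PGD: signed gradient ascent on the loss, projected onto the eps-ball.

    Simplification: starts at x itself; PGD is often run with a random start instead.
    """
    x_adv = x.copy()
    for _ in range(n_steps):
        _, g = loss_and_grad(W, b, x_adv, y)
        x_adv = x_adv + step * np.sign(g)         # ascent step on the loss
        x_adv = np.clip(x_adv, x - eps, x + eps)  # project onto the eps-ball around x
        x_adv = np.clip(x_adv, 0.0, 1.0)          # keep a valid pixel range
    return x_adv


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    W = rng.normal(size=(5, 3))          # hypothetical weights: 5 features, 3 classes
    b = np.zeros(3)
    x = rng.uniform(0.2, 0.8, size=5)    # hypothetical input in [0, 1]
    x_adv = pgd_attack(W, b, x, y=1)
    print("max perturbation:", np.max(np.abs(x_adv - x)))
    print("loss before/after:", loss_and_grad(W, b, x, 1)[0], loss_and_grad(W, b, x_adv, 1)[0])
```

For a real network the gradient would come from automatic differentiation rather than the closed form used here; the projection and sign-step logic stay the same.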