Training Efficiency and Robustness in Deep Learning

Fartash Faghri

arXiv:2112.01423·cs.LG·December 3, 2021

TL;DR

This thesis studies ways to improve the training efficiency and robustness of deep learning models, covering data prioritization, optimization improvements, and adversarial robustness, with both theoretical and practical results.

Contribution

It formalizes hard negative mining, introduces redundancy-aware sampling and gradient clustering, and provides a theoretical analysis of robustness in linear models.

Findings

1. Prioritizing informative training data speeds up convergence and improves generalization on test data.

2. Hard negative mining adds no computational overhead to training.

3. In linear models, optimal robustness depends on the choice of optimizer, regularization, or architecture.
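
As a rough illustration of the second finding, hard negative mining is sketched below as a max-of-hinges ranking loss over a batch of image-caption similarities. This is a minimal sketch under stated assumptions, not the thesis implementation: the function name `hardest_negative_hinge_loss`, the `margin` value, and the convention that `sim[i, j]` scores image `i` against caption `j` with matching pairs on the diagonal are all hypothetical choices made for this example.

```python
import numpy as np


def hardest_negative_hinge_loss(sim, margin=0.2):
    """Hinge ranking loss that keeps only the hardest negative per positive pair.

    Assumption (not from the source page): sim is a square matrix where
    sim[i, j] is the similarity of image i and caption j, and (i, i) pairs match.
    """
    sim = np.asarray(sim, dtype=float)
    n = sim.shape[0]
    pos = np.diag(sim)  # similarity of each matching (image, caption) pair

    # Hinge cost of caption j acting as a negative for image i.
    cost_cap = np.maximum(0.0, margin + sim - pos[:, None])
    # Hinge cost of image i acting as a negative for caption j.
    cost_img = np.maximum(0.0, margin + sim - pos[None, :])

    # A matching pair is never its own negative.
    mask = np.eye(n, dtype=bool)
    cost_cap[mask] = 0.0
    cost_img[mask] = 0.0

    # Hard negative mining: take the max over negatives instead of the sum.
    return cost_cap.max(axis=1).sum() + cost_img.max(axis=0).sum()


if __name__ == "__main__":
    example = [[1.0, 0.9],
               [0.1, 1.0]]
    print(hardest_negative_hinge_loss(example))
```

In this sketch the hinge costs are computed for every pair in the batch either way, so replacing a sum over negatives with a max does not add extra similarity computations. That is one plausible reading of the "no computational overhead" claim, offered here as an interpretation rather than a statement from the thesis.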

Abstract

Deep Learning has revolutionized machine learning and artificial intelligence, achieving superhuman performance in several standard benchmarks. It is well-known that deep learning models are inefficient to train; they learn by processing millions of training data multiple times and require powerful computational resources to process large batches of data in parallel at the same time rather than sequentially. Deep learning models also have unexpected failure modes; they can be fooled into misbehaviour, producing unexpectedly incorrect predictions. In this thesis, we study approaches to improve the training efficiency and robustness of deep learning models. In the context of learning visual-semantic embeddings, we find that prioritizing learning on more informative training data increases convergence speed and improves generalization performance on test data. We formalize a simple trick…

Code & Models

Repositories

fartashf/vsepp (PyTorch, official)

Taxonomy

Topics: Domain Adaptation and Few-Shot Learning · Adversarial Robustness in Machine Learning · Advanced Neural Network Applications

Methods: SPEED: Separable Pyramidal Pooling EncodEr-Decoder for Real-Time Monocular Depth Estimation on Low-Resource Settings