Generalization Properties of Retrieval-based Models
Soumya Basu, Ankit Singh Rawat, Manzil Zaheer

TL;DR
This paper gives a theoretical analysis of how retrieval-based models generalize, covering both local and global approaches, and shows that breaking a task into local sub-tasks lets low-complexity models reach high accuracy.
Contribution
It introduces a formal framework for understanding retrieval-based models, analyzing local empirical risk minimization and kernel-based global models for classification.
Findings
Local sub-tasks enable low complexity models to achieve high accuracy.
Explicit local risk minimization improves generalization in retrieval models.
Kernel methods provide a global approach to retrieval-based classification.
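To make the first two findings concrete, here is a minimal sketch of the local idea: for each query, retrieve its nearest training points and fit a simple model on those points alone. This is an illustration under stated assumptions, not the paper's algorithm. The function names (retrieve, fit_linear, local_erm_predict, demo), the ridge least-squares fit standing in for the risk-minimization step, the Euclidean retriever, and the synthetic XOR-style data are all hypothetical choices made for this example.

```python
import numpy as np


def retrieve(x, X_train, k):
    """Return indices of the k training points nearest to x (Euclidean distance)."""
    dists = np.linalg.norm(X_train - x, axis=1)
    return np.argsort(dists)[:k]


def fit_linear(X, y, ridge=1e-3):
    """Ridge-regularized least squares on labels in {-1, +1}.

    Returns a weight vector whose last entry is the bias term.
    This stands in for the risk-minimization step (an assumption of this sketch).
    """
    A = np.hstack([X, np.ones((len(X), 1))])
    return np.linalg.solve(A.T @ A + ridge * np.eye(A.shape[1]), A.T @ y)


def predict_linear(w, x):
    """Sign of the linear score, mapped to a label in {-1, +1}."""
    return 1.0 if np.append(x, 1.0) @ w >= 0 else -1.0


def local_erm_predict(x, X_train, y_train, k=20):
    """Retrieve k neighbors of x, fit a linear model on them only, then label x."""
    idx = retrieve(x, X_train, k)
    w_local = fit_linear(X_train[idx], y_train[idx])
    return predict_linear(w_local, x)


def demo(seed=0, n_train=500, n_test=200, k=20):
    """Compare one global linear model with per-query local linear models."""
    rng = np.random.default_rng(seed)
    X_train = rng.uniform(-1, 1, size=(n_train, 2))
    X_test = rng.uniform(-1, 1, size=(n_test, 2))

    # Synthetic XOR-style labels: no single linear classifier separates them.
    def label(X):
        return np.where(X[:, 0] * X[:, 1] > 0, 1.0, -1.0)

    y_train, y_test = label(X_train), label(X_test)

    w_global = fit_linear(X_train, y_train)
    global_pred = np.array([predict_linear(w_global, x) for x in X_test])
    local_pred = np.array(
        [local_erm_predict(x, X_train, y_train, k) for x in X_test]
    )
    return np.mean(local_pred == y_test), np.mean(global_pred == y_test)


if __name__ == "__main__":
    local_acc, global_acc = demo()
    print(f"local linear models: {local_acc:.2f}, global linear model: {global_acc:.2f}")
```

On this toy data a single linear model cannot represent the labels, while the same linear model class fitted per neighborhood can, since most neighborhoods are either pure or split by one axis. The neighborhood size k trades off local approximation quality against the number of samples available to each local fit.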
Abstract
Many modern high-performing machine learning models such as GPT-3 primarily rely on scaling up models, e.g., transformer networks. Simultaneously, a parallel line of work aims to improve the model performance by augmenting an input instance with other (labeled) instances during inference. Examples of such augmentations include task-specific prompts and similar examples retrieved from the training data by a nonparametric component. Remarkably, retrieval-based methods have enjoyed success on a wide range of problems, ranging from standard natural language processing and vision tasks to protein folding, as demonstrated by many recent efforts, including WebGPT and AlphaFold. Despite growing literature showcasing the promise of these models, the theoretical underpinning for such models remains underexplored. In this paper, we present a formal treatment of retrieval-based models to…
Taxonomy
Topics: Machine Learning and Data Classification · Machine Learning and Algorithms · Topic Modeling
Methods: Attention Is All You Need · Linear Layer · Dropout · Layer Normalization · Cosine Annealing · Residual Connection · Attention Dropout · Linear Warmup With Cosine Annealing
