Focus of Attention for Linear Predictors
Raphael Pelossof, Zhiliang Ying

TL;DR
This paper introduces an attention mechanism for linear predictors that stops evaluation early on easy examples, significantly reducing computation while maintaining accuracy, demonstrated on multiple datasets.
Contribution
It proposes a novel early stopping method for linear predictors that acts as an attention mechanism, improving efficiency by focusing computation on hard examples.
Findings
Reduces the average number of features evaluated to O(√(n log(1/√δ)))
Achieves substantial computational gains on large datasets
Maintains prediction accuracy with early stopping
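To get a feel for the size of the bound in the first finding, the sketch below evaluates the expression inside the O(·) for one setting. It rests on several assumptions that this summary does not state: n is taken to be the total number of features, delta the error tolerated because of early stopping, the logarithm is natural, and the constant hidden by the O(·) is ignored. The function name `feature_bound_scale` is made up for illustration, and the result indicates an order of magnitude only.

```python
import math


def feature_bound_scale(n: int, delta: float) -> float:
    """Evaluate sqrt(n * ln(1 / sqrt(delta))).

    Assumptions (not from the summary): natural log, and the constant
    factor hidden by the O(.) notation is dropped.
    """
    return math.sqrt(n * math.log(1.0 / math.sqrt(delta)))


# Hypothetical setting: 10,000 features, delta = 0.01.
scale = feature_bound_scale(n=10_000, delta=0.01)
print(f"n = 10000, delta = 0.01 -> about {scale:.0f} features (up to a constant)")
```

Under these assumptions the expression grows with the square root of n, so the expected saving relative to evaluating all n features widens as the feature count increases.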
Abstract
We present a method to stop the evaluation of a prediction process when the result of the full evaluation is obvious. This trait is highly desirable in prediction tasks where a predictor evaluates all its features for every example in large datasets. We observe that some examples are easier to classify than others, a phenomenon which is characterized by the event when most of the features agree on the class of an example. By stopping the feature evaluation when encountering an easy-to-classify example, the predictor can achieve substantial gains in computation. Our method provides a natural attention mechanism for linear predictors where the predictor concentrates most of its computation on hard-to-classify examples and quickly discards easy-to-classify ones. By modifying a linear prediction algorithm such as an SVM or AdaBoost to include our attentive method we prove that the average…
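The early-stopping idea described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the paper's procedure: it assumes a fixed, hand-picked `threshold` on the running margin, whereas the paper derives its own stopping rule, which this summary does not reproduce. The function name `attentive_predict` and the toy inputs are hypothetical.

```python
from typing import Sequence, Tuple


def attentive_predict(
    weights: Sequence[float],
    features: Sequence[float],
    threshold: float,
) -> Tuple[int, int]:
    """Return (label in {-1, +1}, number of features evaluated).

    Accumulates the weighted sum one feature at a time and stops as soon
    as its magnitude reaches `threshold` (an assumed, fixed boundary).
    """
    partial_sum = 0.0
    evaluated = 0
    for w, x in zip(weights, features):
        partial_sum += w * x
        evaluated += 1
        # Easy example: the features seen so far agree strongly enough.
        if abs(partial_sum) >= threshold:
            break
    label = 1 if partial_sum >= 0 else -1
    return label, evaluated


weights = [1.0] * 8
easy = [1.0] * 8                                        # all features agree
hard = [1.0, -1.0, 1.0, -1.0, 1.0, -1.0, 1.0, 1.0]      # features disagree

easy_result = attentive_predict(weights, easy, threshold=3.0)
hard_result = attentive_predict(weights, hard, threshold=3.0)
print("easy:", easy_result)
print("hard:", hard_result)
```

In this toy setup the easy example crosses the threshold after a few features, while the hard example never does and falls back to the full sum, which is the behaviour the abstract calls concentrating computation on hard-to-classify examples.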
Taxonomy
Topics: Neural Networks and Applications · Machine Learning and Data Classification · Machine Learning and Algorithms
