Training for Fast Sequential Prediction Using Dynamic Feature Selection
Emma Strubell, Luke Vilnis, Andrew McCallum

TL;DR
This paper introduces a dynamic feature selection method that accelerates NLP classifiers by ordering features to achieve high confidence with fewer computations, significantly reducing runtime while maintaining high accuracy.
Contribution
It proposes paired learning and inference algorithms for dynamic feature selection, optimizing feature order to speed up classifiers without sacrificing accuracy.
Findings
Achieves a more than five-fold reduction in run-time for left-to-right part-of-speech (POS) tagging.
Preserves tagging accuracy above 97% despite the reduced computation.
Reports these results on the WSJ dataset.
Abstract
We present paired learning and inference algorithms for significantly reducing computation and increasing speed of the vector dot products in the classifiers that are at the heart of many NLP components. This is accomplished by partitioning the features into a sequence of templates which are ordered such that high confidence can often be reached using only a small fraction of all features. Parameter estimation is arranged to maximize accuracy and early confidence in this sequence. We present experiments in left-to-right part-of-speech tagging on WSJ, demonstrating that we can preserve accuracy above 97% with over a five-fold reduction in run-time.
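The inference side of the idea in the abstract can be illustrated with a minimal sketch: template scores are added one at a time in a fixed order, and prediction stops as soon as the classifier is confident enough. The abstract does not say how confidence is measured, so the margin test used here (top score minus runner-up score), along with the names `predict_with_early_stopping` and `margin_threshold`, are assumptions made for illustration and not the authors' implementation.

```python
def predict_with_early_stopping(template_features, template_weights, margin_threshold):
    """Score templates in order and stop once the prediction looks confident.

    template_features: list of feature vectors, one per template (hypothetical layout).
    template_weights:  list of per-template weight matrices, each indexed
                       [class][feature]; assumes at least two classes.
    margin_threshold:  assumed confidence test: stop when the top score
                       beats the runner-up by at least this much.
    Returns (predicted_class_index, number_of_templates_used).
    """
    num_classes = len(template_weights[0])
    scores = [0.0] * num_classes
    templates_used = 0

    for features, weights in zip(template_features, template_weights):
        # Add this template's partial dot product to every class score.
        for c in range(num_classes):
            scores[c] += sum(w * x for w, x in zip(weights[c], features))
        templates_used += 1

        # Assumed stopping rule: margin between the two highest scores.
        best, runner_up = sorted(scores, reverse=True)[:2]
        if best - runner_up >= margin_threshold:
            break

    predicted = max(range(num_classes), key=scores.__getitem__)
    return predicted, templates_used


# Toy example: two classes, two templates.
features = [[1.0], [1.0, 0.0]]
weights = [
    [[3.0], [0.5]],            # template 1: one feature per class row
    [[0.0, 0.0], [5.0, 0.0]],  # template 2: two features per class row
]
early = predict_with_early_stopping(features, weights, margin_threshold=1.0)
full = predict_with_early_stopping(features, weights, margin_threshold=3.0)
```

With a low threshold the first template alone settles the prediction; with a higher threshold the sketch falls through to the second template. Savings of this kind only materialize if the early templates are informative, which is why the abstract pairs the inference procedure with parameter estimation that encourages early confidence.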
Taxonomy
Topics: Machine Learning and Algorithms · Natural Language Processing Techniques · Topic Modeling
