Learning Dynamic Feature Selection for Fast Sequential Prediction
Emma Strubell, Luke Vilnis, Kate Silverstein, Andrew McCallum

TL;DR
This paper introduces a dynamic feature selection method that orders feature templates so that a classifier can often reach a high-confidence prediction after evaluating only a small fraction of them, speeding up NLP classifiers while keeping accuracy high.
Contribution
It proposes paired learning and inference algorithms that order feature templates for early, confident predictions. The authors describe the approach as simpler and better suited to NLP than related cascade methods.
Findings
Over 5x reduction in runtime for POS tagging and parsing.
Maintains POS accuracy above 97% and parsing LAS above 88.5%.
NER F1 stays above 88 with speed more than doubled.
Abstract
We present paired learning and inference algorithms for significantly reducing computation and increasing speed of the vector dot products in the classifiers that are at the heart of many NLP components. This is accomplished by partitioning the features into a sequence of templates which are ordered such that high confidence can often be reached using only a small fraction of all features. Parameter estimation is arranged to maximize accuracy and early confidence in this sequence. Our approach is simpler and better suited to NLP than other related cascade methods. We present experiments in left-to-right part-of-speech tagging, named entity recognition, and transition-based dependency parsing. On the typical benchmarking datasets we can preserve POS tagging accuracy above 97% and parsing LAS above 88.5% both with over a five-fold reduction in run-time, and NER F1 above 88 with more than…
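The inference side of the abstract (scoring with an ordered sequence of templates and stopping once confidence is high) can be sketched roughly as below. This is a minimal illustration, not the authors' implementation: the function name `predict_with_early_stopping`, the data layout, and the stopping rule (the gap between the two highest label scores reaching a fixed `margin`) are all assumptions made for the example.

```python
import numpy as np

def predict_with_early_stopping(template_features, template_weights, margin=1.0):
    """Score labels one template at a time and stop once confidence is high.

    template_features: list of 1-D arrays, one feature vector per template,
                       already in the chosen template order.
    template_weights:  list of 2-D arrays, one per template, each of shape
                       (num_labels, len(feature vector)).
    margin:            hypothetical confidence threshold (an assumption here).
    Returns (predicted_label, number_of_templates_used).
    """
    num_labels = template_weights[0].shape[0]
    scores = np.zeros(num_labels)
    used = 0
    for feats, weights in zip(template_features, template_weights):
        # Add this template's contribution to the running dot product.
        scores += weights @ feats
        used += 1
        if num_labels > 1:
            # Second-largest and largest scores, in that order.
            second, best = np.partition(scores, -2)[-2:]
            if best - second >= margin:
                break  # confident enough: skip the remaining templates
    return int(np.argmax(scores)), used
```

With a small margin the loop can exit after the first template, and with a very large margin it degenerates to the full dot product over all templates, which is the trade-off between speed and accuracy that the abstract describes.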
Taxonomy
Topics: Natural Language Processing Techniques · Topic Modeling · Machine Learning and Algorithms
