Train and You'll Miss It: Interactive Model Iteration with Weak Supervision and Pre-Trained Embeddings
Mayee F. Chen, Daniel Y. Fu, Frederic Sala, Sen Wu, Ravi Teja Mullapudi, Fait Poms, Kayvon Fatahalian, Christopher Ré

TL;DR
This paper combines weak supervision with pre-trained embeddings so that machine learning models can be trained interactively, without large amounts of hand-labeled data and without training a downstream deep network.
Contribution
It proposes a new method that uses pre-trained embeddings to extend weak supervision sources, reducing training time and improving performance without fine-tuning embeddings.
Findings
Outperforms standard weak supervision by 4.1 points
Surpasses transfer learning without fine-tuning by 12.8 points
Achieves near state-of-the-art results with training in less than half a second
Abstract
Our goal is to enable machine learning systems to be trained interactively. This requires models that perform well and train quickly, without large amounts of hand-labeled data. We take a step forward in this direction by borrowing from weak supervision (WS), wherein models can be trained with noisy sources of signal instead of hand-labeled data. But WS relies on training downstream deep networks to extrapolate to unseen data points, which can take hours or days. Pre-trained embeddings can remove this requirement. We do not use the embeddings as features as in transfer learning (TL), which requires fine-tuning for high performance, but instead use them to define a distance function on the data and extend WS source votes to nearby points. Theoretically, we provide a series of results studying how performance scales with changes in source coverage, source accuracy, and the Lipschitzness…
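The idea of extending source votes to nearby points can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `extend_votes` and `majority_vote`, the nearest-voted-neighbor rule, the Euclidean distance, the fixed `radius` threshold, and the majority-vote aggregation are all assumptions chosen for illustration. Votes are +1 or -1, with 0 meaning the source abstains.

```python
import math

def extend_votes(embeddings, votes, radius):
    """Hypothetical helper: fill in abstentions (0) of one weak source.

    An abstaining point copies the vote of its nearest voted point,
    but only if that point lies within `radius` in embedding space.
    """
    voted = [i for i, v in enumerate(votes) if v != 0]
    extended = list(votes)
    for i, v in enumerate(votes):
        if v != 0 or not voted:
            continue  # already voted, or the source never votes
        nearest = min(voted, key=lambda j: math.dist(embeddings[i], embeddings[j]))
        if math.dist(embeddings[i], embeddings[nearest]) <= radius:
            extended[i] = votes[nearest]
    return extended

def majority_vote(per_source_votes):
    """Assumed aggregation: sign of the summed votes per point (0 on ties)."""
    totals = [sum(column) for column in zip(*per_source_votes)]
    return [(t > 0) - (t < 0) for t in totals]

# Toy data: five points with 2-D stand-ins for pre-trained embeddings.
embeddings = [(0.0, 0.0), (0.1, 0.0), (5.0, 5.0), (5.1, 5.0), (10.0, 10.0)]
source_a = [1, 0, -1, 0, 0]
source_b = [0, 0, 0, -1, 0]

extended_a = extend_votes(embeddings, source_a, radius=0.5)
extended_b = extend_votes(embeddings, source_b, radius=0.5)
labels = majority_vote([extended_a, extended_b])
print(extended_a, extended_b, labels)
```

In this toy example, the second and fourth points receive votes from their close neighbors, while the isolated fifth point stays unlabeled because no voted point falls within the radius. No model is trained at any step, which is what makes this style of labeling fast enough for interactive use.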
Taxonomy
Topics: Domain Adaptation and Few-Shot Learning · Multimodal Machine Learning Applications · Speech Recognition and Synthesis
