Leveraging Organizational Resources to Adapt Models to New Data Modalities
Sahaana Suri, Raghuveer Chanda, Neslihan Bulut, Pradyumna Narayana, Yemao Zeng, Peter Bailis, Sugato Basu, Girija Narlikar, Christopher Re, and Abishek Sethi

TL;DR
This paper presents a method for organizations to reuse existing resources, such as knowledge bases and services, to adapt machine learning models to new data modalities, reducing development time from months to days.
Contribution
It introduces a framework that uses organizational resources to create a shared feature space, enabling cross-modal learning and faster model adaptation at scale.
Findings
Reduces model development time from months to days.
Enables cross-modal learning using organizational resources.
Validated on over 5 classification tasks at Google.
Abstract
As applications in large organizations evolve, the machine learning (ML) models that power them must adapt the same predictive tasks to newly arising data modalities (e.g., a new video content launch in a social media application requires existing text or image models to extend to video). To solve this problem, organizations typically create ML pipelines from scratch. However, this fails to utilize the domain expertise and data they have cultivated from developing tasks for existing modalities. We demonstrate how organizational resources, in the form of aggregate statistics, knowledge bases, and existing services that operate over related tasks, enable teams to construct a common feature space that connects new and existing data modalities. This allows teams to apply methods for training data curation (e.g., weak supervision and label propagation) and model training (e.g., forms of…
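To make the label propagation idea mentioned in the abstract concrete, here is a minimal sketch of graph-based label propagation over a shared feature space. It is a generic textbook-style variant, not the authors' pipeline: the function `propagate_labels`, its parameters, and the toy 2-D feature vectors are all hypothetical, and the example simply assumes that points from the existing modality (labeled) and the new modality (unlabeled) have already been mapped into one common feature space.

```python
import numpy as np


def propagate_labels(features, labels, n_neighbors=2, n_iters=20, alpha=0.8):
    """Spread labels over a k-nearest-neighbor graph (hypothetical helper).

    features: (n, d) array of points in the shared feature space.
    labels:   (n,) int array; -1 marks unlabeled (new-modality) points.
    Returns (predicted_labels, class_scores).
    """
    n = features.shape[0]
    labeled = labels >= 0
    classes = np.unique(labels[labeled])

    # Cosine similarity between all pairs of points.
    norms = np.linalg.norm(features, axis=1, keepdims=True)
    unit = features / np.clip(norms, 1e-12, None)
    sim = unit @ unit.T
    np.fill_diagonal(sim, -np.inf)  # a point is never its own neighbor

    # Keep only each point's k most similar neighbors, then symmetrize.
    nearest = np.argsort(-sim, axis=1)[:, :n_neighbors]
    rows = np.repeat(np.arange(n), n_neighbors)
    cols = nearest.ravel()
    weights = np.zeros((n, n))
    weights[rows, cols] = np.clip(sim[rows, cols], 0.0, None)
    weights = np.maximum(weights, weights.T)

    # Row-normalize edge weights into a transition matrix.
    row_sums = weights.sum(axis=1, keepdims=True)
    transition = weights / np.clip(row_sums, 1e-12, None)

    # One-hot seed scores for the labeled points.
    seeds = np.zeros((n, len(classes)))
    seeds[labeled, np.searchsorted(classes, labels[labeled])] = 1.0

    scores = seeds.copy()
    for _ in range(n_iters):
        scores = alpha * (transition @ scores) + (1 - alpha) * seeds
        scores[labeled] = seeds[labeled]  # clamp the known labels

    return classes[scores.argmax(axis=1)], scores


# Toy data (made up for illustration): four labeled points from an existing
# modality and three unlabeled points from a new modality, all assumed to
# live in the same 2-D feature space.
existing_features = np.array([[1.0, 0.1], [0.9, 0.2], [0.1, 1.0], [0.2, 0.9]])
existing_labels = np.array([0, 0, 1, 1])
new_features = np.array([[0.95, 0.15], [0.15, 0.95], [0.8, 0.3]])

all_features = np.vstack([existing_features, new_features])
all_labels = np.concatenate([existing_labels, -np.ones(3, dtype=int)])

predicted, _ = propagate_labels(all_features, all_labels)
print(predicted[len(existing_labels):])  # labels inferred for the new modality
```

The propagated labels would then serve as (noisy) training data for a model on the new modality. The neighbor count, damping factor `alpha`, and cosine similarity are illustrative choices; a production system would tune these and build the graph from real shared features.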
