Latency-Aware Neural Architecture Search with Multi-Objective Bayesian Optimization

David Eriksson; Pierce I-Jen Chuang; Samuel Daulton; Peng Xia; Akshat Shrivastava; Arun Babu; Shicong Zhao; Ahmed Aly; Ganesh Venkatesh; Maximilian Balandat

arXiv:2106.11890·cs.LG·June 29, 2021·5 cites

Open Access

TL;DR

This paper applies multi-objective Bayesian optimization to latency-aware neural architecture search, efficiently exploring the trade-off between model accuracy and on-device latency for a production-scale on-device natural language understanding model.

Contribution

It combines recent Bayesian optimization methods for high-dimensional search spaces with multi-objective Bayesian optimization to tune a model's architecture and hyperparameters against two objectives, on-device latency and accuracy, in a production setting.

Findings

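01. Multi-objective Bayesian optimization explores the trade-off between on-device latency and model accuracy.

02. Bayesian optimization methods designed for high-dimensional search spaces make the search over architectures and hyperparameters efficient.

03. The approach is demonstrated on a production-scale on-device natural language understanding model at Facebook.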

Abstract

When tuning the architecture and hyperparameters of large machine learning models for on-device deployment, it is desirable to understand the optimal trade-offs between on-device latency and model accuracy. In this work, we leverage recent methodological advances in Bayesian optimization over high-dimensional search spaces and multi-objective Bayesian optimization to efficiently explore these trade-offs for a production-scale on-device natural language understanding model at Facebook.
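The "optimal trade-offs" mentioned in the abstract are usually formalized as a Pareto frontier: the set of configurations for which no other configuration is both faster and more accurate. The sketch below is a minimal illustration of that concept only, not the paper's method. The function name `pareto_front` and the candidate (latency, accuracy) values are hypothetical and chosen purely for illustration.

```python
from typing import List, Tuple

Point = Tuple[float, float]  # (latency in ms, accuracy); lower latency and higher accuracy are better


def pareto_front(points: List[Point]) -> List[Point]:
    """Return the non-dominated (latency, accuracy) pairs, sorted by latency.

    A point is dominated if another point has latency <= and accuracy >=,
    with at least one of the two strictly better.
    """
    front: List[Point] = []
    best_acc = float("-inf")
    # Sweep from fastest to slowest; for equal latency, visit the most accurate first.
    for lat, acc in sorted(points, key=lambda p: (p[0], -p[1])):
        # A slower point is worth keeping only if it is strictly more accurate
        # than every faster (or equally fast) point seen so far.
        if acc > best_acc:
            front.append((lat, acc))
            best_acc = acc
    return front


# Hypothetical candidate models (illustrative numbers, not results from the paper).
candidates = [
    (10.0, 0.80),
    (12.0, 0.78),  # dominated by (10.0, 0.80)
    (15.0, 0.85),
    (15.0, 0.83),  # dominated by (15.0, 0.85)
    (20.0, 0.85),  # dominated by (15.0, 0.85)
    (25.0, 0.90),
]
front = pareto_front(candidates)
print(front)
```

A multi-objective optimizer aims to find such a frontier with as few expensive model trainings and on-device latency measurements as possible; this sketch only shows how the frontier is read off a set of already evaluated candidates.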

Taxonomy

Topics: Machine Learning and Data Classification · Advanced Bandit Algorithms Research · Advanced Neural Network Applications