
TL;DR
This paper introduces YOTO, a method that automatically optimizes loss hyperparameters in a single training run using gradient-based methods, streamlining the training process and improving performance in computer vision tasks.
Contribution
YOTO treats loss weight hyperparameters as learnable parameters of the network, enabling one-shot optimization through a differentiable composite loss layer and regularization.
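The idea of learning loss weights through a softmax can be illustrated with a minimal sketch. This is not the authors' implementation: the function names (`softmax`, `composite_loss`, `grad_wrt_logits`, `sgd_step`), the example loss values, and the learning rate are all hypothetical. It assumes the composite loss is a weighted sum of individual losses whose weights come from a softmax over learnable logits. The regularization mentioned above is not specified in this summary, so it is left out.

```python
import math

def softmax(logits):
    # Numerically stable softmax: weights are positive and sum to 1.
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def composite_loss(logits, losses):
    # Weighted sum of individual losses; the weights are softmax(logits).
    weights = softmax(logits)
    return sum(w * l for w, l in zip(weights, losses))

def grad_wrt_logits(logits, losses):
    # Analytic gradient of the composite loss with respect to each logit:
    # dL/dz_i = w_i * (L_i - sum_j w_j * L_j)
    weights = softmax(logits)
    total = sum(w * l for w, l in zip(weights, losses))
    return [w * (l - total) for w, l in zip(weights, losses)]

def sgd_step(logits, losses, lr=0.1):
    # One plain gradient-descent update on the logits (hypothetical lr).
    grads = grad_wrt_logits(logits, losses)
    return [z - lr * g for z, g in zip(logits, grads)]

# Hypothetical values for three loss terms, with uniform initial weights.
logits = [0.0, 0.0, 0.0]
losses = [2.0, 1.0, 0.5]

weights = softmax(logits)
total = composite_loss(logits, losses)
new_logits = sgd_step(logits, losses)
new_weights = softmax(new_logits)
```

In this unregularized sketch, a gradient step shifts weight away from the largest loss and toward the smallest one, so repeated steps would concentrate the weights on whichever loss is easiest to minimize. That suggests why a regularizer on the weights is needed, though its exact form has to be taken from the paper itself.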
Findings
YOTO outperforms grid search in 3D estimation tasks.
YOTO achieves better generalization on unseen data.
The method is effective for semantic segmentation as well.
Abstract
The title of this paper is perhaps an overclaim. Of course, the process of creating and optimizing a learned model inevitably involves multiple training runs, which potentially feature different architectural designs, input and output encodings, and losses. However, our method, You Only Train Once (YOTO), does limit training to one shot for the last of these aspects: loss selection and weighting. We achieve this by automatically optimizing the loss weight hyperparameters of learned models in one shot via standard gradient-based optimization, treating these hyperparameters as regular parameters of the network and learning them. To this end, we leverage the differentiability of the composite loss formulation, which is widely used for optimizing multiple empirical losses simultaneously, and model it as a novel layer parameterized with a softmax operation that satisfies…
Taxonomy
Topics: Advanced Neural Network Applications · 3D Shape Modeling and Analysis · Domain Adaptation and Few-Shot Learning
Methods: Softmax
