TL;DR
This paper uses the Unconstrained Feature Model to analyze neural multivariate regression, revealing how multi-task learning and target normalization strategies influence training performance in deep neural networks.
Contribution
It provides the first qualitative insights into neural multivariate regression using the UFM, demonstrating how multi-task models and target normalization affect training loss.
Findings
Multi-task models achieve strictly smaller training MSE than multiple single-task models when the latter use the same or stronger regularization.
Whitening and normalizing targets reduce training MSE when variance is below one.
Empirical results confirm UFM predictions about training performance improvements.
Abstract
The Unconstrained Feature Model (UFM) is a mathematical framework that enables closed-form approximations for minimal training loss and related performance measures in deep neural networks (DNNs). This paper leverages the UFM to provide qualitative insights into neural multivariate regression, a critical task in imitation learning, robotics, and reinforcement learning. Specifically, we address two key questions: (1) How do multi-task models compare to multiple single-task models in terms of training performance? (2) Can whitening and normalizing regression targets improve training performance? The UFM theory predicts that multi-task models achieve strictly smaller training MSE than multiple single-task models when the same or stronger regularization is applied to the latter, and our empirical results confirm these findings. Regarding whitening and normalizing regression targets, the UFM…
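As an illustration of the two target transformations the abstract refers to, the following is a minimal sketch in NumPy. It assumes the standard definitions: normalizing rescales each target dimension to zero mean and unit variance, and whitening additionally decorrelates the dimensions so that the sample covariance becomes the identity. The function names, the symmetric (ZCA-style) choice of whitening matrix, and the synthetic data are assumptions made for this example and are not taken from the paper.

```python
import numpy as np


def normalize_targets(Y):
    """Rescale each target dimension to zero mean and unit variance."""
    mean = Y.mean(axis=0)
    std = Y.std(axis=0)
    return (Y - mean) / std


def whiten_targets(Y):
    """Center the targets and decorrelate them (sample covariance -> identity).

    Uses the symmetric inverse square root of the covariance, which is one
    common choice of whitening matrix (an assumption here, not the paper's).
    """
    centered = Y - Y.mean(axis=0)
    cov = centered.T @ centered / Y.shape[0]
    eigvals, eigvecs = np.linalg.eigh(cov)
    inv_sqrt = eigvecs @ np.diag(eigvals ** -0.5) @ eigvecs.T
    return centered @ inv_sqrt


# Synthetic, correlated 2-D targets (hypothetical data for illustration only).
rng = np.random.default_rng(0)
mixing = np.array([[0.5, 0.0], [0.3, 0.2]])
Y = rng.normal(size=(1000, 2)) @ mixing.T

Y_norm = normalize_targets(Y)
Y_white = whiten_targets(Y)

# Average per-dimension variance of the raw targets, the kind of quantity
# the variance condition in the findings is stated in terms of.
avg_variance = Y.var(axis=0).mean()
cov_white = Y_white.T @ Y_white / Y_white.shape[0]
```

After the transformation, `Y_norm` has unit variance in each dimension but may still be correlated, whereas `cov_white` is the identity up to numerical error.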
Peer Reviews
Decision·Submitted to ICLR 2026
The paper comes with a very clear message. The theoretical results are clean and easy to digest. Also, please take a look at the summary.
My main problem with the paper is the incremental contribution. I am not personally familiar enough with the scope of this area to judge what counts as a useful contribution in multivariate regression and/or the UFM. 1. Each theorem is a direct algebraic extension of existing UFM results rather than a new conceptual advance. No new analytical machinery or relaxation beyond the original UFM is introduced. This is still fine, but then, I also don't see the significance of the question asked in our overall u
Paper is written clearly
- The terminology used in this paper is misleading; it may lead a reader to think that the conclusions are much stronger than they actually are. The paper uses the words “multivariate” and “multi-task” interchangeably, although they are very different concepts. This work is about “multivariate”, not “multi-task”, so the term “multi-task” should be eliminated. - Theoretical contributions are very weak. It is quite obvious that a multivariate model does typically better than a univariate model, because univa
(1) Provides clear, mathematically grounded closed-form derivations for training MSE under the UFM, extending its use beyond classification to multivariate regression. (2) Establishes interpretable theoretical conditions (via eigenvalue analysis of target covariance) that directly predict when whitening or normalization benefits performance. (3) Validates theoretical insights with thorough empirical results across multiple architectures and datasets, including reinforcement learning and drivin
(1) Theoretical analysis is limited to training MSE and does not address generalization behavior or test-time dynamics, weakening practical relevance. (2) UFM assumptions (infinitely expressive features, linear last layer) are highly idealized; empirical validation does not fully justify their applicability to real networks. (3) The connection between UFM predictions and observed empirical gaps (e.g., systematic underestimation of MSE) is noted but not quantitatively analyzed or modeled.
