Neural Multivariate Regression: Qualitative Insights from the Unconstrained Feature Model

George Andriopoulos; Soyuj Jung Basnet; Juan Guevara; Li Guo; Keith Ross

arXiv:2505.09308·cs.LG·October 1, 2025

3 Reviews

TL;DR

This paper uses the Unconstrained Feature Model to analyze neural multivariate regression, revealing how multi-task learning and target normalization strategies influence training performance in deep neural networks.

Contribution

It provides the first qualitative insights into neural multivariate regression using the UFM, demonstrating how multi-task models and target normalization affect training loss.

Findings

01

Multi-task models outperform single-task models with similar regularization.

02

Whitening and normalizing targets reduce training MSE when variance is below one.

03

Empirical results confirm UFM predictions about training performance improvements.

Abstract

The Unconstrained Feature Model (UFM) is a mathematical framework that enables closed-form approximations for minimal training loss and related performance measures in deep neural networks (DNNs). This paper leverages the UFM to provide qualitative insights into neural multivariate regression, a critical task in imitation learning, robotics, and reinforcement learning. Specifically, we address two key questions: (1) How do multi-task models compare to multiple single-task models in terms of training performance? (2) Can whitening and normalizing regression targets improve training performance? The UFM theory predicts that multi-task models achieve strictly smaller training MSE than multiple single-task models when the same or stronger regularization is applied to the latter, and our empirical results confirm these findings. Regarding whitening and normalizing regression targets, the UFM…
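To make the second question concrete, the following is a minimal sketch of what normalizing and whitening regression targets could look like in NumPy. It is an illustration under stated assumptions, not the authors' implementation: "normalizing" is taken to mean per-dimension standardization to zero mean and unit variance, "whitening" is taken to mean a ZCA-style transform that makes the empirical target covariance the identity, and the function names `normalize_targets` and `whiten_targets`, the `eps` parameter, and the synthetic data are all hypothetical.

```python
import numpy as np


def normalize_targets(Y):
    """Standardize each target dimension to zero mean and unit variance.

    Assumption: this is one common reading of "normalizing" targets.
    Y has shape (n_samples, n_targets).
    """
    mean = Y.mean(axis=0)
    std = Y.std(axis=0)
    return (Y - mean) / std


def whiten_targets(Y, eps=1e-12):
    """ZCA-style whitening: center Y, then decorrelate and rescale it so the
    empirical covariance of the result is (approximately) the identity.

    Assumption: ZCA is used here for illustration; other whitening
    transforms (e.g. PCA whitening) differ only by a rotation.
    """
    mean = Y.mean(axis=0)
    Yc = Y - mean
    cov = Yc.T @ Yc / Y.shape[0]            # empirical target covariance
    eigvals, eigvecs = np.linalg.eigh(cov)  # symmetric eigendecomposition
    # Inverse square root of the covariance; eps guards against tiny eigenvalues.
    W = eigvecs @ np.diag(1.0 / np.sqrt(eigvals + eps)) @ eigvecs.T
    return Yc @ W


# Synthetic, correlated two-dimensional targets (hypothetical data).
rng = np.random.default_rng(0)
A = np.array([[2.0, 0.0], [1.5, 0.5]])
Y = rng.normal(size=(1000, 2)) @ A.T + np.array([3.0, -1.0])

Y_norm = normalize_targets(Y)
Y_white = whiten_targets(Y)

print("per-dimension variance (raw):       ", Y.var(axis=0))
print("per-dimension variance (normalized):", Y_norm.var(axis=0))
print("covariance (whitened):\n", np.cov(Y_white.T, bias=True))
```

Normalization rescales each target dimension independently and leaves cross-dimension correlations in place, whereas whitening also removes those correlations. A network trained on transformed targets would need the inverse transform applied to its predictions before errors are compared on the original scale.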

Peer Reviews

Decision·Submitted to ICLR 2026

Reviewer 01·Rating 4·Confidence 3

Strengths

The paper comes with a very clear message. The theoretical results are clean and easy to digest. Also, please take a look at the summary.

Weaknesses

My main problem with the paper is the incremental contribution. I am not personally familiar with the scope of this area to judge overall what is a useful contribution in multivariate regression and/or UFM. 1. Each theorem is a direct algebraic extension of existing UFM results rather than a new conceptual advance. No new analytical machinery or relaxation beyond the original UFM is introduced. This is still fine, but then, I also don't see the significance of the question asked in our overall u

Reviewer 02·Rating 2·Confidence 3

Strengths

The paper is clearly written.

Weaknesses

- The terminology used in this paper is misleading and may lead a reader to think that the conclusions are much stronger than they actually are. The paper uses the words “multivariate” and “multi-task” interchangeably, although they are very different concepts. This work is about “multivariate”, not “multi-task”, so the term “multi-task” should be eliminated. - Theoretical contributions are very weak. It is quite obvious that a multivariate model does typically better than a univariate model, because univa

Reviewer 03·Rating 4·Confidence 3

Strengths

(1) Provides clear, mathematically grounded closed-form derivations for training MSE under the UFM, extending its use beyond classification to multivariate regression. (2) Establishes interpretable theoretical conditions (via eigenvalue analysis of target covariance) that directly predict when whitening or normalization benefits performance. (3) Validates theoretical insights with thorough empirical results across multiple architectures and datasets, including reinforcement learning and drivin

Weaknesses

(1) Theoretical analysis is limited to training MSE and does not address generalization behavior or test-time dynamics, weakening practical relevance. (2) UFM assumptions (infinitely expressive features, linear last layer) are highly idealized; empirical validation does not fully justify their applicability to real networks. (3) The connection between UFM predictions and observed empirical gaps (e.g., systematic underestimation of MSE) is noted but not quantitatively analyzed or modeled.
