Optimizing Non-decomposable Measures with Deep Networks
Amartya Sanyal, Pawan Kumar, Purushottam Kar, Sanjay Chawla, Fabrizio Sebastiani

TL;DR
This paper introduces algorithms for directly training deep neural networks on complex, task-specific performance measures like F-measure and KL divergence, leading to faster, more stable training.
Contribution
It presents novel algorithms that optimize non-decomposable, structured loss functions directly within deep learning frameworks, improving convergence and efficiency.
Findings
Faster and more stable convergence across datasets.
Reduced training time and sample requirements.
Outperforms traditional and recent task-specific training methods.
Abstract
We present a class of algorithms capable of directly training deep neural networks with respect to large families of task-specific performance measures such as the F-measure and the Kullback-Leibler divergence that are structured and non-decomposable. This presents a departure from standard deep learning techniques that typically use squared or cross-entropy loss functions (that are decomposable) to train neural networks. We demonstrate that directly training with task-specific loss functions yields much faster and more stable convergence across problems and datasets. Our proposed algorithms and implementations have several novel features including (i) convergence to first-order stationary points despite optimizing complex objective functions; (ii) use of fewer training samples to achieve a desired level of convergence; (iii) a substantial reduction in training time; and (iv) a seamless…
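To illustrate what "non-decomposable" means here, the following is a minimal sketch and not the paper's algorithm. It assumes binary labels in {0, 1} and uses hypothetical helper names (`accuracy`, `f_measure`). It shows that accuracy over a dataset equals the average of accuracies over equal-sized batches, whereas the F-measure does not, which is why it cannot be written as a sum of per-example losses.

```python
def accuracy(y_true, y_pred):
    """Fraction of correct predictions; a decomposable measure."""
    return sum(1 for t, p in zip(y_true, y_pred) if t == p) / len(y_true)


def f_measure(y_true, y_pred, beta=1.0):
    """F-beta computed from confusion counts; a non-decomposable measure."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    denom = (1 + beta ** 2) * tp + beta ** 2 * fn + fp
    # Convention assumed here: return 0.0 when there are no positives at all.
    return (1 + beta ** 2) * tp / denom if denom else 0.0


# Toy data (made up for illustration), split into two batches of four.
y_true = [1, 1, 0, 0, 1, 0, 0, 0]
y_pred = [1, 1, 0, 0, 0, 1, 0, 0]
batches = [(y_true[:4], y_pred[:4]), (y_true[4:], y_pred[4:])]

full_acc = accuracy(y_true, y_pred)
mean_batch_acc = sum(accuracy(t, p) for t, p in batches) / len(batches)

full_f1 = f_measure(y_true, y_pred)
mean_batch_f1 = sum(f_measure(t, p) for t, p in batches) / len(batches)

print("accuracy: full vs. batch mean:", full_acc, mean_batch_acc)
print("F1:       full vs. batch mean:", full_f1, mean_batch_f1)
```

On this toy data the two accuracy values agree, while the full-data F1 and the mean of per-batch F1 scores differ. Averaging a per-batch surrogate therefore does not, in general, optimize the dataset-level F-measure.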
