TL;DR
This paper extends harmonic loss by exploring various distance metrics beyond Euclidean, demonstrating improved model performance, interpretability, and sustainability across vision and language tasks.
Contribution
It systematically evaluates non-Euclidean distance metrics in harmonic loss, revealing benefits in accuracy, interpretability, and environmental impact.
Findings
Cosine distances improve accuracy and reduce emissions in vision tasks.
Bray-Curtis and Mahalanobis distances enhance interpretability, with varying efficiency.
Cosine-based harmonic losses stabilize training and improve representations in language models.
Abstract
Cross-entropy loss has long been the standard choice for training deep neural networks, yet it suffers from interpretability limitations, unbounded weight growth, and inefficiencies that can contribute to costly training dynamics. The harmonic loss is a distance-based alternative grounded in Euclidean geometry that improves interpretability and mitigates phenomena such as grokking, or delayed generalization on the test set. However, the study of harmonic loss remains narrow: only the Euclidean distance has been explored, and no systematic evaluation of computational efficiency or sustainability has been conducted. We extend harmonic loss by systematically investigating a broad spectrum of distance metrics as replacements for the Euclidean distance. We comprehensively evaluate distance-tailored harmonic losses on both vision backbones and large language models. Our analysis is framed around a three-way…
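To make the idea of a distance-based loss with a swappable metric concrete, here is a minimal sketch in plain Python. It assumes the harmonic-loss form commonly given for the Euclidean case, where the class probability is proportional to the distance raised to the power -n, and it assumes that another metric (here cosine distance, 1 minus cosine similarity) is simply substituted for the Euclidean one. The function names, the exponent `n`, and the `eps` constant are illustrative choices, not taken from the paper, whose exact definitions may differ.

```python
import math

def euclidean(x, w):
    """Euclidean distance between an input x and a class center w."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, w)))

def cosine(x, w, eps=1e-12):
    """Cosine distance (1 - cosine similarity), clamped to be non-negative."""
    dot = sum(a * b for a, b in zip(x, w))
    norm = math.sqrt(sum(a * a for a in x)) * math.sqrt(sum(b * b for b in w))
    return max(0.0, 1.0 - dot / (norm + eps))

def harmonic_probs(x, centers, distance=euclidean, n=2.0, eps=1e-12):
    """Assumed form: p_i = d_i^(-n) / sum_j d_j^(-n), computed in log space."""
    log_scores = [-n * math.log(distance(x, w) + eps) for w in centers]
    m = max(log_scores)  # subtract the max before exponentiating for stability
    exps = [math.exp(s - m) for s in log_scores]
    total = sum(exps)
    return [e / total for e in exps]

def harmonic_loss(x, centers, target, distance=euclidean, n=2.0):
    """Negative log-probability of the target class under harmonic_probs."""
    return -math.log(harmonic_probs(x, centers, distance, n)[target])

# Hypothetical toy example: two class centers in 2-D, input close to class 0.
x = [1.0, 0.0]
centers = [[1.0, 0.1], [0.0, 1.0]]
print(harmonic_probs(x, centers, distance=euclidean))
print(harmonic_probs(x, centers, distance=cosine))
print(harmonic_loss(x, centers, target=0, distance=cosine))
```

Passing the metric as a function keeps the loss itself unchanged when the distance is swapped, which is one simple way to compare metrics under otherwise identical settings. Metrics that need extra state, such as a Mahalanobis distance with its covariance matrix, would need a closure or an additional argument.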
