Re-basin via implicit Sinkhorn differentiation
Fidel A. Guerrero Peña, Heitor Rapela Medeiros, Thomas Dubail, Masih Aminbeidokhti, Eric Granger, Marco Pedersoli

TL;DR
This paper introduces a differentiable Sinkhorn re-basin network that improves permutation finding in neural networks, enabling better incremental learning and mode connectivity, with competitive results on benchmark datasets.
Contribution
It presents a novel differentiable re-basin method using Sinkhorn optimization, facilitating integration into gradient-based training and enhancing continual learning capabilities.
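As a rough illustration of the Sinkhorn idea mentioned above, the sketch below alternates row and column normalization of a score matrix until it is approximately doubly stochastic, which gives a soft, differentiable stand-in for a hard permutation. The function names, the temperature `tau`, and the iteration count are assumptions chosen for this example; this is not the paper's implementation and it does not show the implicit differentiation step.

```python
import numpy as np

def logsumexp(a, axis):
    # Numerically stable log(sum(exp(a))) along one axis.
    m = np.max(a, axis=axis, keepdims=True)
    return m + np.log(np.sum(np.exp(a - m), axis=axis, keepdims=True))

def sinkhorn(scores, tau=1.0, n_iters=200):
    # tau and n_iters are illustrative defaults, not values from the paper.
    log_p = scores / tau
    for _ in range(n_iters):
        log_p = log_p - logsumexp(log_p, axis=1)  # normalize each row
        log_p = log_p - logsumexp(log_p, axis=0)  # normalize each column
    return np.exp(log_p)

rng = np.random.default_rng(0)
scores = rng.normal(size=(4, 4))   # hypothetical learnable scores
soft_perm = sinkhorn(scores)       # approximately doubly stochastic
```

Lowering `tau` generally pushes the result closer to a hard permutation matrix, at the cost of slower convergence of the normalization loop.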
Findings
Outperforms existing methods in permutation optimization tasks.
Enables incremental learning through a new cost function.
Achieves competitive results on benchmark datasets.
Abstract
The recent emergence of new algorithms for permuting models into functionally equivalent regions of the solution space has shed some light on the complexity of error surfaces and on some promising properties such as mode connectivity. However, finding the right permutation is challenging, and current optimization techniques are not differentiable, which makes them difficult to integrate into gradient-based optimization and often leads to sub-optimal solutions. In this paper, we propose a Sinkhorn re-basin network with the ability to obtain the transportation plan that better suits a given objective. Unlike the current state of the art, our method is differentiable and, therefore, easy to adapt to any task within the deep learning domain. Furthermore, we show the advantage of our re-basin method by proposing a new cost function that allows performing incremental learning by exploiting the linear…
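To make "functionally equivalent" concrete, here is a minimal sketch using a toy two-layer ReLU network: permuting the hidden units, and applying the inverse permutation to the next layer's input weights, changes the parameters but not the output. The layer sizes, variable names, and the `mlp` helper are assumptions made for this illustration only.

```python
import numpy as np

rng = np.random.default_rng(0)
d_in, d_hidden, d_out = 3, 5, 2   # hypothetical layer sizes

W1 = rng.normal(size=(d_hidden, d_in))
b1 = rng.normal(size=d_hidden)
W2 = rng.normal(size=(d_out, d_hidden))
b2 = rng.normal(size=d_out)

def mlp(x, W1, b1, W2, b2):
    h = np.maximum(W1 @ x + b1, 0.0)  # ReLU hidden layer
    return W2 @ h + b2

# Build a permutation matrix P such that (P @ v)[i] == v[perm[i]].
perm = rng.permutation(d_hidden)
P = np.eye(d_hidden)[perm]

# Permute the hidden units and undo the permutation in the next layer.
W1_p, b1_p = P @ W1, P @ b1
W2_p = W2 @ P.T

x = rng.normal(size=d_in)
out_original = mlp(x, W1, b1, W2, b2)
out_permuted = mlp(x, W1_p, b1_p, W2_p, b2)
```

The two outputs agree up to floating-point error because ReLU acts elementwise, so it commutes with the permutation, and `P.T @ P` is the identity.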
Taxonomy
Topics: Infrastructure Maintenance and Monitoring · Advanced Neural Network Applications · Domain Adaptation and Few-Shot Learning
