TL;DR
This paper introduces a learnable, Gray-Wyner-inspired three-channel codec for vision tasks that separates shared from task-specific information, reducing redundancy and outperforming independent coding in two-task scenarios spanning six vision benchmarks.
Contribution
The paper develops a learnable three-channel codec framework, characterizes its limits through the notion of lossy common information, and connects classical information theory with modern task-driven learning.
Findings
Reduces redundancy in multi-task vision representations
Outperforms independent coding in two-task scenarios spanning six vision benchmarks
Demonstrates practical benefits of Gray-Wyner theory in machine learning
Abstract
Many computer vision tasks share substantial overlapping information, yet conventional codecs tend to ignore this, leading to redundant and inefficient representations. The Gray-Wyner network, a classical concept from information theory, offers a principled framework for separating common and task-specific information. Inspired by this idea, we develop a learnable three-channel codec that disentangles shared information from task-specific details across multiple vision tasks. We characterize the limits of this approach through the notion of lossy common information, and propose an optimization objective that balances inherent tradeoffs in learning such representations. Through comparisons of three codec architectures on two-task scenarios spanning six vision benchmarks, we demonstrate that our approach substantially reduces redundancy and consistently outperforms independent coding.…
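The redundancy argument behind the three-channel structure can be illustrated with a small worked example. The sketch below is not the paper's codec: it is a toy discrete setting, assumed here purely for illustration, in which two sources share one fair bit `C` and each carries one private fair bit (`A` or `B`). It compares the total rate of coding the two sources independently with the total rate of one common channel plus two private channels. All variable names are hypothetical.

```python
import itertools
import math
from collections import Counter


def entropy(samples):
    """Shannon entropy in bits of the empirical distribution over `samples`."""
    counts = Counter(samples)
    total = sum(counts.values())
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


# Toy assumption: C (shared), A and B (task-specific) are independent fair bits,
# so all eight (c, a, b) outcomes are equally likely.
outcomes = list(itertools.product([0, 1], repeat=3))

# Source 1 sees (C, A); source 2 sees (C, B).
x1 = [(c, a) for c, a, b in outcomes]
x2 = [(c, b) for c, a, b in outcomes]

# Independent coding pays for the shared bit twice.
independent_rate = entropy(x1) + entropy(x2)

# Three-channel coding: one common channel for C, one private channel each for A and B.
common_rate = entropy([c for c, a, b in outcomes])
private_rate_1 = entropy([a for c, a, b in outcomes])
private_rate_2 = entropy([b for c, a, b in outcomes])
three_channel_rate = common_rate + private_rate_1 + private_rate_2

print(f"independent coding: {independent_rate:.1f} bits")
print(f"three-channel coding: {three_channel_rate:.1f} bits")
```

In this toy case the shared bit is transmitted once instead of twice, so the three-channel total is one bit lower. Real vision tasks do not split this cleanly into shared and private parts, which is where the lossy setting and the tradeoffs mentioned in the abstract come in.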