Similarity-Dissimilarity Loss for Multi-label Supervised Contrastive Learning
Guangming Huang, Yunfei Long, Cunjin Luo

TL;DR
This paper introduces a novel Similarity-Dissimilarity Loss for multi-label supervised contrastive learning, addressing positive sample ambiguity and improving performance across image, text, and medical data domains.
Contribution
It formulates multi-label relations, proposes a dynamic re-weighting loss, provides theoretical proofs, and unifies single-label and multi-label contrastive learning frameworks.
Findings
Outperforms baseline methods in diverse tasks
Achieves state-of-the-art on MIMIC-III-Full
Demonstrates robustness across modalities
Abstract
Supervised contrastive learning has achieved remarkable success by leveraging label information; however, determining positive samples in multi-label scenarios remains a critical challenge. In multi-label supervised contrastive learning (MSCL), multi-label relations are not yet fully defined, leading to ambiguity in identifying positive samples and formulating contrastive loss functions to construct the representation space. To address these challenges, we: (i) systematically formulate multi-label relations in MSCL, (ii) propose a novel Similarity-Dissimilarity Loss, which dynamically re-weights samples based on similarity and dissimilarity factors, (iii) further provide theoretically grounded proofs for our method through rigorous mathematical analysis that supports the formulation and effectiveness, and (iv) offer a unified form and paradigm for both single-label and multi-label…
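The abstract's idea of re-weighting positives by label overlap can be illustrated with a small sketch. This is not the paper's Similarity-Dissimilarity Loss, whose exact factors are not given in this summary. It is a generic supervised contrastive loss in which each pair is weighted by the Jaccard overlap of its label sets, a stand-in chosen purely for illustration. The function names `jaccard_weight` and `weighted_supcon_loss` are hypothetical, and the default temperature of 0.07 is the value the reviews below attribute to the paper.

```python
import math


def jaccard_weight(labels_a, labels_b):
    """Hypothetical pair weight: shared labels divided by all labels in either set."""
    a, b = set(labels_a), set(labels_b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def weighted_supcon_loss(embeddings, label_sets, temperature=0.07):
    """Label-overlap-weighted supervised contrastive loss (illustrative only).

    embeddings: list of non-zero vectors (lists of floats), one per sample.
    label_sets: list of label collections, one per sample.
    Returns the mean loss over anchors that have at least one weighted positive.
    """
    def dot(u, v):
        return sum(x * y for x, y in zip(u, v))

    def normalize(v):
        norm = math.sqrt(dot(v, v))  # assumes v is not the zero vector
        return [x / norm for x in v]

    z = [normalize(v) for v in embeddings]
    n = len(z)
    total, n_anchors = 0.0, 0

    for i in range(n):
        # Scaled cosine similarity between anchor i and every other sample.
        logits = {j: dot(z[i], z[j]) / temperature for j in range(n) if j != i}
        if not logits:
            continue

        # Numerically stable log-sum-exp over all non-anchor samples.
        m = max(logits.values())
        log_denom = m + math.log(sum(math.exp(v - m) for v in logits.values()))

        # Assumption: weight each candidate positive by label-set overlap.
        weights = {j: jaccard_weight(label_sets[i], label_sets[j]) for j in logits}
        weight_sum = sum(weights.values())
        if weight_sum == 0:
            continue  # anchor shares no label with any other sample

        # Weighted average of the negative log-probabilities of the positives.
        total += -sum(w * (logits[j] - log_denom) for j, w in weights.items()) / weight_sum
        n_anchors += 1

    return total / n_anchors if n_anchors else 0.0


if __name__ == "__main__":
    toy_embeddings = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
    toy_labels = [{0, 1}, {0}, {2}, {2, 3}]
    print(weighted_supcon_loss(toy_embeddings, toy_labels))
```

With single-label data the Jaccard weight is 1 for samples sharing the anchor's label and 0 otherwise, so this sketch reduces to the standard supervised contrastive loss averaged over positives. That mirrors, in a simplified way, the unification of single-label and multi-label settings that the abstract claims.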
Peer Reviews
Decision: ICLR 2026 Conference Withdrawn Submission
This paper constructs a dynamically weighted loss function that integrates a similarity factor and a dissimilarity factor, unifying single-label and multi-label supervised contrastive losses under one paradigm. The theoretical design targets a clearly defined problem and has academic value. The experiments are comprehensive: they cover image, text, and medical data, and account for factors such as the long-tailed label distribution in the medical domain (the MIMIC series).
The basis for selecting the loss function's hyperparameters is missing: the paper sets the temperature parameter τ=0.07 but does not explain why this value suits multi-modal (image, text, medical) data, nor does it provide a sensitivity analysis of τ, so the robustness of the hyperparameter choice cannot be verified. Ablation experiments are missing, and the value of the key modules is not verified. The proposed loss function contains two core modules: similarity factor and diff
1. Clear problem and challenge definition: The paper clearly shows the challenges in MSCL and illustrates the weaknesses of baseline methods. The five-relation taxonomy is a clear way to demonstrate the problem. 2. Loss function: The proposed loss function is intuitive and shown to be effective through both theoretical proof and experimental results. The core idea of separately accounting for similarity and dissimilarity is a novel and powerful concept. It directly addresses the
1. Insufficient Results & Analysis in Main Paper: The experimental validation in the main paper (Section 3) is too brief and lacks sufficient detail. The most critical comparison, the SOTA benchmark on MIMIC-III-Full (Table 6), is relegated to the appendix. The main paper should contain the strongest evidence of the method's efficacy, including key SOTA comparisons, to allow a reader to assess its performance without hunting through the appendix. 2. Limited Novelty: The paper's novelty appears
1. The paper provides a well-structured and comprehensive categorization of label relations in multi-label contrastive learning. 2. The writing is clear and adheres to academic conventions.
1. The paper tackles a relatively minor issue in multi-label contrastive learning—the inability to distinguish between R2 and R5—and existing multi-label contrastive methods can already address this. The prior work “Contrastive Learning for Multi-Label Classification” presents a conceptually similar solution: like the proposed similarity–dissimilarity factors, it operates by increasing the denominator of the positive-pair term. 2. The Lemma 1 presented in the paper is not applicable to multi-lab
1) It offers a clear and systematic formalization of multi-label relationships using set theory, providing a rigorous foundation for the field. 2) The proposed Similarity-Dissimilarity loss is both simple and interpretable, with a mathematically bounded design that facilitates stable training and analysis. 3) A major strength is the comprehensive theoretical proof, which grounds the method's properties and moves beyond mere empirical tuning. 4) The extensive cross-modal and cross-domain valida
1. Although this paper innovates in the definition of relations and the design of the S–D loss, existing work (e.g., [ref 1] using set operations to synthesize multi-label samples in feature space; [ref 2] proposing label-aware contrastive weighting; and [ref 3] integrating label hierarchies into supervised contrastive) has also explored leveraging label relations to improve representation learning at different levels. Authors are advised to clearly and concisely explain the differences between
Taxonomy
Topics: Text and Document Classification Technologies
Methods: Contrastive Learning
