Topic Modeling as Multi-Objective Contrastive Optimization
Thong Nguyen, Xiaobao Wu, Xinshuai Dong, Cong-Duy T Nguyen, See-Kiong Ng, Anh Tuan Luu

TL;DR
This paper introduces a multi-objective contrastive optimization framework for neural topic models that balances reconstruction and semantic generalization, leading to improved topic quality and diversity.
Contribution
It proposes a novel contrastive learning method for sets of topic vectors and formulates contrastive topic modeling as a multi-objective optimization problem to enhance model performance.
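A set-level contrastive objective of this kind can be sketched as follows. This is a minimal illustration, not the authors' implementation: the excerpt does not specify how sets are formed, how positives are generated, or the exact loss, so the consecutive grouping, the noise-based second view, the InfoNCE form, and the names `set_representations`, `info_nce`, `set_size`, and `temperature` are all assumptions. Max pooling is used only because the reviews mention it.

```python
import numpy as np

def set_representations(topic_vectors, set_size):
    """Group consecutive documents into sets and max-pool each set's topic vectors.

    Assumption: sets are formed from consecutive rows; the paper may group differently.
    """
    n_docs, n_topics = topic_vectors.shape
    n_sets = n_docs // set_size
    grouped = topic_vectors[: n_sets * set_size].reshape(n_sets, set_size, n_topics)
    return grouped.max(axis=1)  # element-wise max over the documents in each set

def info_nce(anchors, positives, temperature=0.5):
    """InfoNCE loss: row i of `anchors` should match row i of `positives`;
    every other row of `positives` acts as a negative."""
    a = anchors / np.linalg.norm(anchors, axis=1, keepdims=True)
    p = positives / np.linalg.norm(positives, axis=1, keepdims=True)
    logits = a @ p.T / temperature
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return float(-np.mean(np.diag(log_probs)))

# Hypothetical usage: 8 documents, 5 topics, sets of 4 documents.
rng = np.random.default_rng(0)
theta = rng.dirichlet(np.ones(5), size=8)          # stand-in topic proportions
noisy = theta + 0.01 * rng.random(theta.shape)     # stand-in for an augmented view
theta_aug = noisy / noisy.sum(axis=1, keepdims=True)
loss = info_nce(set_representations(theta, 4), set_representations(theta_aug, 4))
```

Pooling before contrasting means the loss compares what a group of documents has in common rather than the details of any single document, which is the motivation the abstract gives for moving away from document-level pairs.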
Findings
Improved topic coherence and diversity.
Better downstream task performance.
Consistent outperformance over baseline models.
Abstract
Recent representation learning approaches enhance neural topic models by optimizing the weighted linear combination of the evidence lower bound (ELBO) of the log-likelihood and the contrastive learning objective that contrasts pairs of input documents. However, document-level contrastive learning might capture low-level mutual information, such as word ratio, which disturbs topic modeling. Moreover, there is a potential conflict between the ELBO loss that memorizes input details for better reconstruction quality, and the contrastive loss which attempts to learn topic representations that generalize among input documents. To address these issues, we first introduce a novel contrastive learning method oriented towards sets of topic vectors to capture useful semantics that are shared among a set of input documents. Secondly, we explicitly cast contrastive topic modeling as a gradient-based…
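The abstract is cut off before it names the optimizer, so the following is only a sketch of one standard way to balance two objectives by their gradients: the two-task min-norm rule from the multiple-gradient descent algorithm (MGDA). Whether the paper uses exactly this rule is an assumption, and the names `min_norm_weight`, `combine_gradients`, `g_elbo`, and `g_con` are hypothetical. Instead of fixing the weight of the linear combination in advance, the rule picks the weight that minimizes the norm of the combined gradient.

```python
import numpy as np

def min_norm_weight(g_elbo, g_con):
    """Closed-form alpha in [0, 1] minimizing ||alpha * g_elbo + (1 - alpha) * g_con||^2."""
    diff = g_elbo - g_con
    denom = float(diff @ diff)
    if denom == 0.0:
        return 0.5  # identical gradients: any weight gives the same direction
    alpha = float((g_con - g_elbo) @ g_con) / denom
    return min(1.0, max(0.0, alpha))  # clip to the valid range

def combine_gradients(g_elbo, g_con):
    """Return the weight and the combined update direction."""
    alpha = min_norm_weight(g_elbo, g_con)
    return alpha, alpha * g_elbo + (1.0 - alpha) * g_con

# Hypothetical flattened gradients of the two losses w.r.t. shared parameters.
alpha, direction = combine_gradients(np.array([1.0, 0.0]), np.array([0.0, 1.0]))
```

When the combined direction is non-zero, it has a non-negative inner product with both gradients, so a small step along its negative does not increase either loss to first order. When it is zero, the current parameters are Pareto stationary: no direction improves both objectives at once.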
Peer Reviews
Decision: ICLR 2024 poster
+ The key idea of this paper (i.e., learning low-level mutual information of neural topic models that optimize ELBO and contrastive learning together) is very well-motivated.
+ The usage of setwise contrastive topic modeling is reasonable. Casting it as a multi-task learning problem and adopting multi-objective optimization to find a Pareto solution are technically novel.
+ A comprehensive set of benchmark datasets and baselines is considered. The authors also perform detailed ablation studie
- Statistical significance tests are missing. It is unclear whether the gaps between the proposed model and baselines/ablation versions are statistically significant or not. In particular, some gaps in Tables 3 and 6 are quite subtle, and the variances of classification scores in Table 2 are unknown, therefore p-values should be reported.
- Only automatic metrics (e.g., NPMI and TD) are used to evaluate topic quality. Although the authors also examine document classification as a downstream tas
1. This paper is well-organized and equations are clearly written.
2. Extensive experiments are performed and the results show that this method consistently achieves high performance.
3. Code is provided in the supplementary materials to ensure reproducibility.
1. [1] also uses contrastive learning to capture useful semantics of topic vectors, which is similar to the proposed method, yet this paper does not clearly compare itself with [1] or explain its novelty.
2. This paper omits important baselines. For example, [1] also presents great performance in this task but this paper does not compare with it in the experiments. Which contrastive learning method performs better?
[1] Xiaobao Wu, Anh Tuan Luu, and Xinshuai Dong. Mitigating data sparsity for short
- The set-wise contrastive learning is new and effective for resolving low-level mutual information of neural topic models.
- Formulating the contrastive learning as a multi-task learning problem and solving it by a multi-objective optimization algorithm is an interesting idea.
- Experimental comparisons with state-of-the-art neural topic models, including recent ones such as WeTe, demonstrate the effectiveness of the proposed method.
- The effectiveness of the ChatGPT-based data augmentation is unclear.
- The formal definition of the contrastive loss is missing in the main text, while only an incomplete definition can be found in Algorithm 1.
- The justification for the use of MaxPooling is unclear.
- The authors seem to assume that it is possible to find "the optimal" Pareto solution (a Pareto optimal solution with optimal balance), while there is no superiority or inferiority between Pareto optimal solutions.
Taxonomy
Topics: Advanced Text Analysis Techniques · Big Data and Business Intelligence
Methods: Sparse Evolutionary Training · Contrastive Learning
