Human-Centric Topic Modeling with Goal-Prompted Contrastive Learning and Optimal Transport

Rui Wang; Yi Zheng; Dongxin Wang; Haiping Huang; Yuanzhi Yao; Yuxiang Zhou; Jialin Yu; Philip Torr

arXiv:2604.12663·cs.AI·April 15, 2026

TL;DR

This paper introduces a human-centric topic modeling approach that integrates user goals into the process, using LLM prompts and optimal transport to produce more interpretable and goal-aligned topics.

Contribution

It proposes GCTM-OT, a novel method combining goal-prompted contrastive learning and optimal transport to enhance goal alignment and diversity in topic modeling.

Findings

1. GCTM-OT outperforms the baselines on topic coherence and diversity.

2. It significantly improves alignment with the human-provided goal.

3. Experiments on three public subreddit datasets support these results.

Abstract

Existing topic modeling methods, from LDA to recent neural and LLM-based approaches, focus mainly on statistical coherence and often produce redundant or off-target topics that miss the user's underlying intent. We introduce Human-centric Topic Modeling (Human-TM), a novel task formulation that integrates a human-provided goal directly into the topic modeling process to produce interpretable, diverse, and goal-oriented topics. To tackle this challenge, we propose the Goal-prompted Contrastive Topic Model with Optimal Transport (GCTM-OT), which first uses LLM-based prompting to extract goal candidates from documents, then incorporates these into semantic-aware contrastive learning via optimal transport for topic discovery. Experimental results on three public subreddit datasets show that GCTM-OT outperforms state-of-the-art…
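
The abstract does not spell out how optimal transport is used, so the following is only an illustrative sketch of the general idea of softly matching goal candidates to topics. It assumes entropy-regularized optimal transport solved with Sinkhorn iterations, a cosine-distance cost, and uniform marginals; the 2-D embeddings and the names `cosine_cost`, `sinkhorn`, `candidates`, and `topics` are hypothetical and are not taken from the paper.

```python
import math


def cosine_cost(x, y):
    """Cosine distance (1 - cosine similarity) between two vectors."""
    dot = sum(p * q for p, q in zip(x, y))
    norm_x = math.sqrt(sum(p * p for p in x))
    norm_y = math.sqrt(sum(q * q for q in y))
    return 1.0 - dot / (norm_x * norm_y)


def sinkhorn(cost, a, b, eps=0.5, n_iters=200):
    """Entropy-regularized transport plan with row marginals a and column marginals b."""
    n, m = len(a), len(b)
    # Gibbs kernel: cheap (similar) pairs receive large weights.
    kernel = [[math.exp(-cost[i][j] / eps) for j in range(m)] for i in range(n)]
    u = [1.0] * n
    v = [1.0] * m
    for _ in range(n_iters):
        # Rescale rows so that each row of the plan sums to a[i].
        u = [a[i] / sum(kernel[i][j] * v[j] for j in range(m)) for i in range(n)]
        # Rescale columns so that each column of the plan sums to b[j].
        v = [b[j] / sum(kernel[i][j] * u[i] for i in range(n)) for j in range(m)]
    return [[u[i] * kernel[i][j] * v[j] for j in range(m)] for i in range(n)]


# Hypothetical toy embeddings: three goal candidates and two topics.
candidates = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]]
topics = [[1.0, 0.0], [0.0, 1.0]]

cost = [[cosine_cost(c, t) for t in topics] for c in candidates]
a = [1.0 / len(candidates)] * len(candidates)  # assumed uniform mass per candidate
b = [1.0 / len(topics)] * len(topics)          # assumed uniform mass per topic

plan = sinkhorn(cost, a, b)
for row in plan:
    print([round(p, 3) for p in row])
```

Each row of `plan` shows how one candidate's mass is split across topics. Because the column marginals force every topic to receive an equal share, the plan cannot collapse all candidates onto a single topic, which is one intuition for why a transport constraint might encourage topic diversity. A smaller `eps` makes the assignment sharper, at the cost of slower and less numerically stable iterations.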
