One Configuration to Rule Them All? Towards Hyperparameter Transfer in Topic Models using Multi-Objective Bayesian Optimization
Silvia Terragni, Ismail Harrando, Pasquale Lisena, Raphael Troncy, Elisabetta Fersini

TL;DR
This paper explores multi-objective hyperparameter optimization for topic models, revealing conflicting objectives and the importance of dataset characteristics, and suggests potential transferability of hyperparameters across datasets.
Contribution
It introduces a multi-objective Bayesian optimization approach for tuning topic models and demonstrates the potential for hyperparameter transferability based on corpus features.
Findings
Conflicting objectives in hyperparameter tuning for topic models
Training corpus characteristics influence hyperparameter choices
Hyperparameters can potentially be transferred between datasets
Abstract
Topic models are statistical methods that extract underlying topics from document collections. When performing topic modeling, a user usually desires topics that are coherent, diverse from one another, and that constitute good document representations for downstream tasks (e.g., document classification). In this paper, we conduct a multi-objective hyperparameter optimization of three well-known topic models. The obtained results reveal that the objectives conflict with one another and that the characteristics of the training corpus are crucial for hyperparameter selection, suggesting that it is possible to transfer the optimal hyperparameter configurations between datasets.
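To make the idea of conflicting objectives concrete, the sketch below shows how one might pick out the non-dominated (Pareto-optimal) hyperparameter configurations once each has been scored on coherence, diversity, and classification quality. It is only an illustration: the configuration names and scores are hypothetical placeholders, not results from the paper, and the snippet covers just the Pareto-filtering step, not the Bayesian optimization loop the authors use.

```python
from typing import Dict, List, Tuple

# (coherence, diversity, classification score); higher is better for all three.
Scores = Tuple[float, float, float]


def dominates(a: Scores, b: Scores) -> bool:
    """Return True if `a` is at least as good as `b` everywhere and strictly better somewhere."""
    at_least_as_good = all(x >= y for x, y in zip(a, b))
    strictly_better = any(x > y for x, y in zip(a, b))
    return at_least_as_good and strictly_better


def pareto_front(candidates: Dict[str, Scores]) -> List[str]:
    """Return the names of configurations that no other configuration dominates."""
    front = []
    for name, scores in candidates.items():
        dominated = any(
            dominates(other_scores, scores)
            for other_name, other_scores in candidates.items()
            if other_name != name
        )
        if not dominated:
            front.append(name)
    return front


# Hypothetical scores, invented purely for illustration.
candidates = {
    "config_a": (0.60, 0.70, 0.50),  # most coherent
    "config_b": (0.40, 0.90, 0.55),  # most diverse
    "config_c": (0.55, 0.65, 0.45),  # worse than config_a on every objective
    "config_d": (0.30, 0.60, 0.80),  # best for classification
}

front = pareto_front(candidates)
print(front)
```

When objectives conflict, the front contains several configurations rather than a single winner, so choosing among them depends on which objective matters most for the task at hand.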
Taxonomy
Topics: Topic Modeling · Computational and Text Analysis Methods · Text and Document Classification Technologies
Methods: Linear Layer · Attention Dropout · Adam · Linear Warmup With Linear Decay · WordPiece · Residual Connection · Softmax · Layer Normalization · Multi-Head Attention · Weight Decay
