Lessons from the Bible on Modern Topics: Low-Resource Multilingual Topic Model Evaluation
Shudong Hao, Jordan Boyd-Graber, Michael J. Paul

TL;DR
This paper introduces a new intrinsic evaluation method for multilingual topic models that aligns with human judgments and improves assessment accuracy for low-resource languages.
Contribution
The paper presents a novel intrinsic evaluation metric for multilingual topic models and an adaptation model to enhance evaluation in low-resource language scenarios.
Findings
New evaluation metric correlates with human judgments.
Adaptation model improves metric reliability for low-resource languages.
The metric also correlates with performance in downstream applications.
Abstract
Multilingual topic models enable document analysis across languages through coherent multilingual summaries of the data. However, there is no standard and effective metric to evaluate the quality of multilingual topics. We introduce a new intrinsic evaluation of multilingual topic models that correlates well with human judgments of multilingual topic coherence as well as performance in downstream applications. Importantly, we also study evaluation for low-resource languages. Because standard metrics fail to accurately measure topic quality when robust external resources are unavailable, we propose an adaptation model that improves the accuracy and reliability of these metrics in low-resource settings.
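The abstract does not spell out the form of the metric. As a rough illustration only, the sketch below assumes an NPMI-style co-occurrence score computed over a parallel reference corpus (for instance, aligned Bible verses, as the title suggests). The function names `cnpmi` and `topic_coherence`, the toy corpus, and the handling of word pairs that never co-occur are all assumptions made for this example, not details taken from the paper.

```python
import math
from itertools import product


def cnpmi(word_a, word_b, parallel_docs):
    """NPMI-style score for one word pair spanning two languages.

    parallel_docs is a list of (words_lang_a, words_lang_b) pairs of sets,
    one pair per aligned document in the reference corpus (hypothetical layout).
    """
    n = len(parallel_docs)
    count_a = sum(1 for doc_a, _ in parallel_docs if word_a in doc_a)
    count_b = sum(1 for _, doc_b in parallel_docs if word_b in doc_b)
    count_ab = sum(
        1 for doc_a, doc_b in parallel_docs
        if word_a in doc_a and word_b in doc_b
    )
    if count_ab == 0:
        # Assumption: use the limiting NPMI value for pairs that never co-occur.
        return -1.0
    if count_ab == n:
        # Both words appear in every aligned pair: PMI is 0 and the
        # normalizer is 0, so report 0 (no information either way).
        return 0.0
    p_a, p_b, p_ab = count_a / n, count_b / n, count_ab / n
    return math.log(p_ab / (p_a * p_b)) / -math.log(p_ab)


def topic_coherence(top_words_a, top_words_b, parallel_docs):
    """Average the pair score over all cross-lingual pairs of top topic words."""
    pairs = list(product(top_words_a, top_words_b))
    return sum(cnpmi(a, b, parallel_docs) for a, b in pairs) / len(pairs)


# Toy aligned corpus (English / Spanish), invented for illustration.
docs = [
    ({"god", "light"}, {"dios", "luz"}),
    ({"god", "earth"}, {"dios", "tierra"}),
    ({"light", "day"}, {"luz", "dia"}),
    ({"water", "earth"}, {"agua", "tierra"}),
]

score = topic_coherence(["god", "light"], ["dios", "luz"], docs)
```

A score of this kind depends entirely on the reference corpus: when the corpus is small or off-domain, many word pairs never co-occur and the estimate becomes unreliable, which is the low-resource problem the abstract's adaptation model is meant to address.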
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Advanced Text Analysis Techniques
