AHAM: Adapt, Help, Ask, Model -- Harvesting LLMs for literature mining
Boshko Koloski, Nada Lavrač, Bojan Cestnik, Senja Pollak, Blaž Škrlj, and Andrej Kastrin

TL;DR
The paper introduces AHAM, a methodology that leverages large language models and domain expert input to improve scientific literature mining through adaptive topic modeling, reducing outliers and enhancing topic relevance.
Contribution
It presents a novel domain-adaptive framework combining LLaMa2 and expert-guided prompts for more accurate scientific text analysis and literature discovery.
Findings
AHAM effectively uncovers novel insights in scientific literature.
Domain adaptation improves topic modeling precision and reduces outliers.
Evaluation shows strong interaction between domain adaptation and topic quality.
Abstract
In an era marked by a rapid increase in scientific publications, researchers grapple with the challenge of keeping pace with field-specific advances. We present the 'AHAM' methodology and a metric that guides the domain-specific adaptation of the BERTopic topic modeling framework to improve scientific text analysis. By utilizing the LLaMa2 generative language model, we generate topic definitions via one-shot learning by crafting prompts with the help of domain experts to guide the LLM for literature mining by asking it to model the topic names. For inter-topic similarity evaluation, we leverage metrics from language generation and translation processes to assess lexical and semantic similarity of the generated topics. Our system aims to reduce both the ratio of outlier topics to the total number of topics and the similarity between topic definitions. The…
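The two quantities the abstract says the system tries to reduce can be illustrated with a small sketch. This is not the paper's metric: the function names, the use of `difflib.SequenceMatcher` as a stand-in lexical similarity, and the choice to multiply the two terms are all assumptions made for illustration only.

```python
from difflib import SequenceMatcher
from itertools import combinations


def outlier_ratio(n_outlier_topics: int, n_topics: int) -> float:
    """Ratio of outlier topics to the total number of topics."""
    if n_topics <= 0:
        raise ValueError("n_topics must be positive")
    return n_outlier_topics / n_topics


def mean_pairwise_similarity(definitions: list[str]) -> float:
    """Average lexical similarity over all pairs of topic definitions.

    SequenceMatcher is an assumed stand-in; the abstract only says that
    metrics from language generation and translation are used.
    """
    pairs = list(combinations(definitions, 2))
    if not pairs:
        return 0.0  # fewer than two definitions: nothing to compare
    scores = [SequenceMatcher(None, a, b).ratio() for a, b in pairs]
    return sum(scores) / len(scores)


def combined_score(n_outlier_topics: int, n_topics: int,
                   definitions: list[str]) -> float:
    """Hypothetical combination (a plain product); lower is better."""
    return (outlier_ratio(n_outlier_topics, n_topics)
            * mean_pairwise_similarity(definitions))
```

Under these assumptions, a configuration with few outliers and clearly distinct topic definitions scores close to zero, while one with many outliers and near-duplicate definitions scores close to one.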
Taxonomy
Topics: Topic Modeling · Advanced Text Analysis Techniques · Natural Language Processing Techniques
