Assessing the trade-off between prediction accuracy and interpretability for topic modeling on energetic materials corpora
Monica Puerto, Mason Kellett, Rodanthi Nikopoulou, Mark D. Fuge, Ruth Doherty, Peter W. Chung, and Zois Boukouvalas

TL;DR
This paper investigates the trade-off between prediction accuracy and interpretability in topic identification for energetics research, comparing three document embedding methods of varying computational complexity and using local explanations validated by domain experts.
Contribution
It introduces a comparative analysis of embedding methods and applies local interpretability techniques to energetics datasets, addressing domain-specific challenges.
Findings
The three document embedding methods compared, which differ in computational complexity, also differ in prediction accuracy and interpretability.
Local explanations help validate classifier decisions with domain experts.
The study provides insights into optimal trade-offs for energetics research.
Abstract
As the amount and variety of energetics research increases, machine aware topic identification is necessary to streamline future research pipelines. The makeup of an automatic topic identification process consists of creating document representations and performing classification. However, the implementation of these processes on energetics research imposes new challenges. Energetics datasets contain many scientific terms that are necessary to understand the context of a document but may require more complex document representations. Secondly, the predictions from classification must be understandable and trusted by the chemists within the pipeline. In this work, we study the trade-off between prediction accuracy and interpretability by implementing three document embedding methods that vary in computational complexity. With our accuracy results, we also introduce local interpretability…
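The two-stage process the abstract describes (build a document representation, classify it, then explain an individual prediction) can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: the excerpt does not name the three embedding methods or the local interpretability technique, so the sketch assumes a TF-IDF bag-of-words representation, a logistic regression classifier, and per-token contributions (weight times feature value) as the local explanation. The corpus, the labels "performance" and "synthesis", and all function names are hypothetical.

```python
import math
from collections import Counter

# Hypothetical toy corpus and topic labels, invented purely for illustration.
DOCS = [
    ("detonation velocity of the explosive crystal", "performance"),
    ("detonation pressure measured for the explosive charge", "performance"),
    ("synthesis route and yield of the nitrated compound", "synthesis"),
    ("synthesis of the compound by nitration with high yield", "synthesis"),
]

def tokenize(text):
    return text.lower().split()

def fit_idf(token_lists):
    """Smoothed inverse document frequency for every training token."""
    n = len(token_lists)
    df = Counter(tok for toks in token_lists for tok in set(toks))
    return {tok: math.log((1 + n) / (1 + c)) + 1.0 for tok, c in df.items()}

def embed(tokens, idf):
    """L2-normalised TF-IDF vector as a sparse dict; unseen tokens are dropped."""
    tf = Counter(t for t in tokens if t in idf)
    vec = {t: c * idf[t] for t, c in tf.items()}
    norm = math.sqrt(sum(v * v for v in vec.values())) or 1.0
    return {t: v / norm for t, v in vec.items()}

def logit(vec, weights, bias):
    return bias + sum(weights.get(t, 0.0) * v for t, v in vec.items())

def train_logreg(vectors, targets, epochs=200, lr=0.5):
    """Binary logistic regression fitted with plain stochastic gradient descent."""
    weights, bias = {}, 0.0
    for _ in range(epochs):
        for vec, y in zip(vectors, targets):
            p = 1.0 / (1.0 + math.exp(-logit(vec, weights, bias)))
            err = p - y
            for t, v in vec.items():
                weights[t] = weights.get(t, 0.0) - lr * err * v
            bias -= lr * err
    return weights, bias

def explain(vec, weights):
    """Local explanation: each token's additive contribution to the logit."""
    contribs = [(t, weights.get(t, 0.0) * v) for t, v in vec.items()]
    return sorted(contribs, key=lambda kv: abs(kv[1]), reverse=True)

# Stage 1: document representations. Stage 2: classification.
token_lists = [tokenize(text) for text, _ in DOCS]
idf = fit_idf(token_lists)
vectors = [embed(toks, idf) for toks in token_lists]
targets = [1 if label == "performance" else 0 for _, label in DOCS]
weights, bias = train_logreg(vectors, targets)

# Stage 3: explain one prediction on an unseen (hypothetical) document.
query = embed(tokenize("detonation velocity of a new explosive"), idf)
z = logit(query, weights, bias)
prob = 1.0 / (1.0 + math.exp(-z))
ranked = explain(query, weights)
print(f"P(performance) = {prob:.3f}")
for token, contribution in ranked:
    print(f"{token:>12s} {contribution:+.3f}")
```

For a linear model on sparse features the explanation is exact, since the contributions plus the bias sum to the logit. That directness is one reason simpler representations tend to be easier for a domain expert to audit; denser embeddings generally need a model-agnostic explainer instead, which is outside what this sketch covers.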
Taxonomy
Topics: Machine Learning in Materials Science · Computational Drug Discovery Methods
Methods: Attentive Walk-Aggregating Graph Neural Network
