Emerging categories in scientific explanations
Giacomo Magnifico, Eduard Barbu

TL;DR
This paper introduces a new dataset of human-like scientific explanations from biomedical literature, categorizing them into emerging explanation types to support AI understanding and generation of explanations.
Contribution
It provides a large-scale, annotated dataset of scientific explanations with multi-class categories, addressing the lack of datasets for human-like explanations in AI research.
Findings
Achieved a Krippendorff's alpha of 0.667 for the 3-class annotation
Extracted explanation sentences from biomedical literature
Organized explanations into multi-class categories
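The agreement figure above is a Krippendorff's alpha score. As a rough illustration of how such a score can be computed for nominal (unordered) categories, here is a minimal sketch. The function name `krippendorff_alpha_nominal`, the toy labels "A", "B", "C", and the input layout (one list of annotator labels per sentence, with `None` for a missing annotation) are assumptions made for this example and do not come from the paper or its dataset.

```python
from collections import Counter
from itertools import permutations


def krippendorff_alpha_nominal(units):
    """Krippendorff's alpha for nominal labels.

    `units` is a list with one entry per annotated item; each entry is a
    list of the labels given by the annotators (None = missing label).
    This layout is an assumption of the sketch, not the paper's format.
    """
    # Coincidence matrix: each ordered pair of labels from two different
    # annotators of the same item contributes 1 / (m - 1), where m is the
    # number of labels the item received.
    coincidences = Counter()
    for ratings in units:
        values = [r for r in ratings if r is not None]
        m = len(values)
        if m < 2:
            continue  # items with fewer than two labels carry no pairs
        for a, b in permutations(values, 2):
            coincidences[(a, b)] += 1.0 / (m - 1)

    # Marginal totals per category and overall number of pairable labels.
    totals = Counter()
    for (a, _b), weight in coincidences.items():
        totals[a] += weight
    n = sum(totals.values())
    if n <= 1:
        raise ValueError("need at least one item with two or more labels")

    # Observed and expected disagreement (both left unnormalised by n,
    # since the factor cancels in the ratio).
    observed = sum(w for (a, b), w in coincidences.items() if a != b)
    expected = sum(
        totals[a] * totals[b] for a in totals for b in totals if a != b
    ) / (n - 1)
    if expected == 0:
        raise ValueError("alpha is undefined when only one category is used")

    return 1.0 - observed / expected


# Toy data only: four items, up to three annotators each.
toy_units = [
    ["A", "A", "A"],
    ["A", "B", "A"],
    ["C", "C", None],
    ["B", "B", "B"],
]
print(round(krippendorff_alpha_nominal(toy_units), 3))
```

A value of 1 indicates perfect agreement, 0 indicates agreement no better than chance, and negative values indicate systematic disagreement.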
Abstract
Clear and effective explanations are essential for human understanding and knowledge dissemination. The scope of scientific research aiming to understand the essence of explanations has recently expanded from the social sciences to machine learning and artificial intelligence. Explanations for machine learning decisions must be impactful and human-like, and there is a lack of large-scale datasets focusing on human-like and human-generated explanations. This work aims to provide such a dataset by: extracting sentences that indicate explanations from scientific literature among various sources in the biotechnology and biophysics topic domains (e.g. PubMed's PMC Open Access subset); providing a multi-class notation derived inductively from the data; evaluating annotator consensus on the emerging categories. The sentences are organized in an openly-available dataset, with two different…
