CHIMERA: A Knowledge Base of Scientific Idea Recombinations for Research Analysis and Ideation
Noy Sternlicht, Tom Hope

TL;DR
CHIMERA is a large-scale knowledge base of scientific idea recombinations mined from literature, enabling analysis of innovation patterns and training models for cross-disciplinary research suggestions.
Contribution
The paper introduces a new information extraction task, curates an expert-annotated dataset, and demonstrates two applications: analyzing recombination patterns and generating research hypotheses.
Findings
Successfully extracted recombination instances from scientific papers.
Analyzed cross-disciplinary recombination patterns in AI.
Generated research ideas rated as inspiring by researchers.
Abstract
A hallmark of human innovation is recombination -- the creation of novel ideas by integrating elements from existing concepts and mechanisms. In this work, we introduce CHIMERA, the first large-scale Knowledge Base (KB) of recombination examples automatically mined from the scientific literature. CHIMERA enables empirical analysis of how scientists recombine concepts and draw inspiration from different areas, and enables training models that propose cross-disciplinary research directions. To construct this KB, we define a new information extraction task: identifying recombination instances in papers. We curate an expert-annotated dataset and use it to fine-tune an LLM-based extraction model, which we apply to a broad corpus of AI papers. We also demonstrate generalization to a biological domain. We showcase the utility of CHIMERA through two applications. First, we analyze patterns of…
Code & Models
- 🤗 noystl/llama-8b-e2e (model)
- 🤗 noystl/scibert_token_classifier (model, 8 downloads)
- 🤗 noystl/mistral_abstract_classifier (model, 4 downloads)
- 🤗 noystl/mistral-base-model (model, 7 downloads)
- 🤗 noystl/mistral-abstract-cot-classifier (model, 4 downloads)
- 🤗 noystl/mistral-e2e (model, 1 download)
- 🤗 noystl/recomb-pred-bge-large-en (model, 2 downloads)
- 🤗 noystl/recomb-pred-e5-large (model, 2 downloads)
- 🤗 noystl/recomb-pred-all-mpnet-base (model)
