A Dynamic Self-Evolving Extraction System
Moin Amin-Naseri, Hannah Kim, Estevam Hruschka

TL;DR
The paper introduces DySECT, a self-evolving extraction system that iteratively improves its knowledge base and extraction accuracy through a closed-loop cycle involving LLMs and reasoning over structured knowledge.
Contribution
It presents a novel framework for dynamic, self-improving information extraction that combines knowledge base expansion, reasoning, and feedback to enhance extraction quality over time.
Findings
The system effectively updates its knowledge base with domain-specific information.
Extraction accuracy improves through iterative feedback and reasoning.
The approach adapts to shifting terminology and emerging jargon.
Abstract
The extraction of structured information from raw text is a fundamental component of many NLP applications, including document retrieval, ranking, and relevance estimation. High-quality extractions often require domain-specific accuracy, up-to-date understanding of specialized taxonomies, and the ability to incorporate emerging jargon and rare outliers. In many domains--such as medical, legal, and HR--the extraction model must also adapt to shifting terminology and benefit from explicit reasoning over structured knowledge. We propose DySECT, a Dynamic Self-Evolving Extraction and Curation Toolkit, which continually improves as it is used. The system incrementally populates a versatile, self-expanding knowledge base (KB) with triples extracted by the LLM. The KB further enriches itself through the integration of probabilistic knowledge and graph-based reasoning, gradually accumulating…
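The closed loop the abstract describes (extract triples, store them in the KB, reason over the KB, feed the result into later extractions) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract is truncated and gives no algorithmic detail, so every name here (`KnowledgeBase`, `toy_extract`, `run_loop`) is hypothetical. A rule-based stub stands in for the LLM extractor, observation counts stand in for probabilistic knowledge, and transitive closure over `is_a` stands in for graph-based reasoning.

```python
from collections import defaultdict


class KnowledgeBase:
    """Toy triple store (hypothetical; not the DySECT KB)."""

    def __init__(self):
        # (subject, relation, object) -> number of times observed or inferred
        self.counts = defaultdict(int)

    def add(self, triple):
        self.counts[triple] += 1

    def confidence(self, triple):
        # Assumed smoothing n / (n + 1): repeated evidence raises confidence.
        n = self.counts.get(triple, 0)
        return n / (n + 1)

    def resolve(self, term):
        # Map jargon to a canonical term when an alias triple is already known.
        for (s, r, o) in self.counts:
            if r == "alias_of" and s == term:
                return o
        return term

    def infer_transitive(self, relation="is_a"):
        # Add (a, rel, c) whenever (a, rel, b) and (b, rel, c) are both present.
        new = set()
        changed = True
        while changed:
            changed = False
            edges = [(s, o) for (s, r, o) in self.counts if r == relation]
            known = set(edges)
            for a, b in edges:
                for b2, c in edges:
                    if b == b2 and a != c and (a, c) not in known:
                        self.counts[(a, relation, c)] += 1
                        known.add((a, c))
                        new.add((a, relation, c))
                        changed = True
        return new


def toy_extract(text, kb):
    # Stand-in for the LLM extractor: two hard-coded sentence patterns.
    # The KB feeds back into extraction by normalizing known aliases.
    triples = []
    for sentence in text.split("."):
        sentence = sentence.strip().lower()
        if " means " in sentence:
            s, o = sentence.split(" means ", 1)
            triples.append((s.strip(), "alias_of", o.strip()))
        elif " is a " in sentence:
            s, o = sentence.split(" is a ", 1)
            triples.append((kb.resolve(s.strip()), "is_a", kb.resolve(o.strip())))
    return triples


def run_loop(documents):
    # One pass per document: extract, store, then reason over the updated KB.
    kb = KnowledgeBase()
    for doc in documents:
        for triple in toy_extract(doc, kb):
            kb.add(triple)
        kb.infer_transitive("is_a")
    return kb


docs = [
    "RN means registered nurse. Registered nurse is a nurse.",
    "Nurse is a clinician. RN is a licensed professional.",
]
kb = run_loop(docs)
for triple in sorted(kb.counts):
    print(triple, round(kb.confidence(triple), 2))
```

In this toy run the alias learned from the first document lets the second document's "RN" be stored under "registered nurse", and the reasoning step adds the unstated triple that a registered nurse is a clinician. That is the kind of compounding improvement the abstract attributes to the system, shown here only at toy scale.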
Taxonomy
Topics: Topic Modeling · Biomedical Text Mining and Ontologies · Advanced Text Analysis Techniques
