ReactIE: Enhancing Chemical Reaction Extraction with Weak Supervision
Ming Zhong, Siru Ouyang, Minhao Jiang, Vivian Hu, Yizhu Jiao, Xuan Wang, Jiawei Han

TL;DR
ReactIE is a novel weakly supervised method that leverages pattern-based cues and synthetic patent data to improve chemical reaction extraction from scientific literature, addressing data scarcity issues.
Contribution
ReactIE introduces a combined weak supervision approach using pattern cues and synthetic data, significantly enhancing chemical reaction extraction performance.
Findings
ReactIE outperforms all existing baselines.
It uses synthetic data from patent records as distant supervision to bring domain knowledge into the model.
It uses frequent textual patterns as linguistic cues to improve extraction accuracy.
Abstract
Structured chemical reaction information plays a vital role for chemists engaged in laboratory work and advanced endeavors such as computer-aided drug design. Despite the importance of extracting structured reactions from scientific literature, data annotation for this purpose is cost-prohibitive due to the significant labor required from domain experts. Consequently, the scarcity of sufficient training data poses an obstacle to the progress of related models in this domain. In this paper, we propose ReactIE, which combines two weakly supervised approaches for pre-training. Our method utilizes frequent patterns within the text as linguistic cues to identify specific characteristics of chemical reactions. Additionally, we adopt synthetic data from patent records as distant supervision to incorporate domain knowledge into the model. Experiments demonstrate that ReactIE achieves…
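As an illustration of the pattern-based weak supervision idea described in the abstract, here is a minimal sketch, not the authors' implementation. The cue phrases in `PRODUCT_CUES`, the stop words, and the function name `weak_product_labels` are all assumptions made for this example; they are not taken from the paper.

```python
import re
from typing import List, Tuple

# Hypothetical cue phrases that often precede a reaction product.
# These are illustrative assumptions, not the patterns mined by ReactIE.
PRODUCT_CUES = ("to give", "to afford", "to yield")

# Match a cue, skip an optional "the", then lazily capture text up to a
# stop word ("in", "as", "with"), punctuation, or the end of the sentence.
_PATTERN = re.compile(
    r"\b(?:" + "|".join(re.escape(c) for c in PRODUCT_CUES) + r")\s+"
    r"(?:the\s+)?(?P<span>[^,.;]+?)(?=\s+(?:in|as|with)\b|[,.;]|$)",
    re.IGNORECASE,
)


def weak_product_labels(sentence: str) -> List[Tuple[int, int, str]]:
    """Return (start, end, text) spans weakly labeled as reaction products."""
    return [
        (m.start("span"), m.end("span"), m.group("span"))
        for m in _PATTERN.finditer(sentence)
    ]


example = "The ester was hydrolyzed with NaOH to give the carboxylic acid in 85% yield."
labels = weak_product_labels(example)
```

Spans produced this way are noisy, which is why such labels would serve as a pre-training signal rather than as gold annotations.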
Taxonomy
Topics: Machine Learning in Materials Science · Advanced Text Analysis Techniques · Computational Drug Discovery Methods
