Reconstruction of Protein-Protein Interaction Pathways by Mining Subject-Verb-Objects Intermediates
Maurice HT Ling, Christophe Lefevre, Kevin R. Nicholas, and Feng Lin

TL;DR
This paper presents Muscorian, a flexible text mining tool built on a generic processor, capable of extracting protein-protein interactions with high precision, demonstrating the effectiveness of a two-layered processing paradigm in biological literature analysis.
Contribution
It introduces Muscorian, a novel, adaptable text mining system that employs a two-layered generalization-specialization approach for extracting biological interactions without extensive tool modification.
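The two-layered idea can be illustrated with a toy sketch. It assumes the generalization layer has already reduced sentences to subject-verb-object triples; the triples, the protein list, and the verb list below are hypothetical examples, not data or rules from the paper, and no MontyLingua call is shown. The specialization layer is then only a domain-specific filter over that generic intermediate.

```python
from typing import NamedTuple


class SVO(NamedTuple):
    """One subject-verb-object triple, the generic intermediate format."""
    subject: str
    verb: str
    obj: str


# Hypothetical output of the generalization layer (a generic text processor).
TRIPLES = [
    SVO("insulin", "activate", "insulin receptor"),
    SVO("researchers", "study", "mice"),
    SVO("STAT5", "bind", "prolactin receptor"),
    SVO("insulin", "study", "STAT5"),
]

# Hypothetical domain knowledge used only by the specialization layer.
PROTEINS = {"insulin", "insulin receptor", "STAT5", "prolactin receptor"}
INTERACTION_VERBS = {"activate", "bind"}


def extract_interactions(triples, proteins, verbs):
    """Keep triples whose subject and object are proteins and whose verb
    is an interaction verb."""
    return [
        t for t in triples
        if t.subject in proteins and t.obj in proteins and t.verb in verbs
    ]


interactions = extract_interactions(TRIPLES, PROTEINS, INTERACTION_VERBS)
for t in interactions:
    print(f"{t.subject} --{t.verb}--> {t.obj}")
```

Because the domain knowledge lives only in the filter, pointing the same triples at a different task means swapping the entity and verb lists rather than modifying the generic processor.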
Findings
Achieved 86-90% precision and approximately 30% recall in extracting protein-protein interactions
Demonstrated flexibility of the paradigm across multiple tasks
Comparable performance to specialized tools
Abstract
The exponential increase in the publication rate of new articles is limiting researchers' access to relevant literature. This has prompted the use of text mining tools to extract key biological information. Previous studies have reported extensive modification of existing generic text processors to process biological text. However, this requirement for modification has not been examined. In this study, we have constructed Muscorian using MontyLingua, a generic text processor. It uses a previously proposed two-layered generalization-specialization paradigm, in which text is generically processed into a suitable intermediate format before domain-specific data extraction techniques are applied at the specialization layer. Evaluation using a corpus and experts indicated 86-90% precision and approximately 30% recall in extracting protein-protein interactions, which was comparable to previous…
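For readers unfamiliar with the two metrics quoted above, the following minimal sketch shows how precision and recall are conventionally computed for a set of extracted interaction pairs against a gold-standard set. The pairs are made-up placeholders, not the paper's evaluation data, and the function name is hypothetical.

```python
def precision_recall(extracted, gold):
    """Return (precision, recall) for extracted items against a gold set."""
    extracted, gold = set(extracted), set(gold)
    true_positives = len(extracted & gold)
    precision = true_positives / len(extracted) if extracted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    return precision, recall


# Toy gold standard of 10 interacting pairs (placeholder names).
gold = {
    ("A", "B"), ("A", "C"), ("B", "D"), ("C", "D"), ("D", "E"),
    ("E", "F"), ("A", "F"), ("B", "F"), ("C", "E"), ("B", "E"),
}
# Toy system output: 3 correct pairs and 1 spurious pair.
extracted = {("A", "B"), ("A", "C"), ("B", "D"), ("X", "Y")}

precision, recall = precision_recall(extracted, gold)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

In this toy case most of what the system extracts is correct, but it finds only a minority of the known pairs, which is the same high-precision, low-recall pattern the abstract reports.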
Taxonomy
Topics: Biomedical Text Mining and Ontologies · Semantic Web and Ontologies · Natural Language Processing Techniques
