A system for information extraction from scientific texts in Russian
Elena Bruches, Anastasia Mezentseva, Tatiana Batura

TL;DR
This paper introduces a system for information extraction from Russian scientific texts. It recognizes terms, extracts relations between them, and links terms to knowledge-base entities without requiring large amounts of labeled data, in support of information retrieval, recommendation systems, and classification.
Contribution
The system performs term recognition, relation extraction, and entity linking end-to-end for Russian without large labeled datasets, which makes it applicable in low- and mid-resource settings.
Findings
Term recognition, relation extraction, and term linking for Russian scientific texts in a single end-to-end system.
No large labeled dataset is required, which saves the time and effort of data labeling.
Open-source implementation available for research use.
Abstract
In this paper, we present a system for information extraction from scientific texts in the Russian language. The system performs several tasks in an end-to-end manner: term recognition, extraction of relations between terms, and linking of terms to entities from a knowledge base. These tasks are extremely important for information retrieval, recommendation systems, and classification. The advantage of the implemented methods is that the system does not require a large amount of labeled data, which saves the time and effort of data labeling and makes the system applicable in low- and mid-resource settings. The source code is publicly available and can be used for different research purposes.
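The three stages named in the abstract can be pictured as a chain in which each stage consumes the output of the previous one. The sketch below is only an illustration of that data flow, not the authors' implementation: the lexicon lookup, the adjacent-pair relation rule, the `RELATED_TO` label, the `KB:0001` identifier, and all function and class names are hypothetical stand-ins chosen to keep the example self-contained.

```python
from dataclasses import dataclass, replace
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Term:
    text: str
    start: int                       # character offset, inclusive
    end: int                         # character offset, exclusive
    entity_id: Optional[str] = None  # knowledge-base identifier, if linked


@dataclass(frozen=True)
class Relation:
    head: Term
    tail: Term
    label: str


# Toy resources invented for this illustration only.
TERM_LEXICON = ["машинное обучение", "нейронная сеть"]
KNOWLEDGE_BASE = {"машинное обучение": "KB:0001"}  # hypothetical identifier


def recognize_terms(text: str) -> List[Term]:
    """Stage 1 (stand-in): find lexicon entries by case-insensitive search."""
    lowered = text.lower()  # assumes lowercasing preserves offsets, true for Russian
    terms = []
    for entry in TERM_LEXICON:
        pos = lowered.find(entry)
        while pos != -1:
            end = pos + len(entry)
            terms.append(Term(text[pos:end], pos, end))
            pos = lowered.find(entry, end)
    return sorted(terms, key=lambda t: t.start)


def link_terms(terms: List[Term]) -> List[Term]:
    """Stage 3 (stand-in): attach a knowledge-base ID by exact lookup."""
    return [replace(t, entity_id=KNOWLEDGE_BASE.get(t.text.lower())) for t in terms]


def extract_relations(terms: List[Term]) -> List[Relation]:
    """Stage 2 (stand-in): relate each term to the next one in the text."""
    return [Relation(a, b, "RELATED_TO") for a, b in zip(terms, terms[1:])]


def run_pipeline(text: str) -> Tuple[List[Term], List[Relation]]:
    """Run the three stages in sequence on one text."""
    terms = link_terms(recognize_terms(text))
    return terms, extract_relations(terms)


if __name__ == "__main__":
    sample = "Нейронная сеть применяется в задачах, где используется машинное обучение."
    found_terms, found_relations = run_pipeline(sample)
    for term in found_terms:
        print(term)
    for relation in found_relations:
        print(relation.head.text, relation.label, relation.tail.text)
```

In a real system each stand-in would be replaced by a trained or weakly supervised component; the point here is only the shape of the intermediate data passed between stages.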
Taxonomy
Topics: Semantic Web and Ontologies · Advanced Text Analysis Techniques · Natural Language Processing Techniques
