Detecting multiword phrases in mathematical text corpora
Winfried Gödert

TL;DR
This paper introduces a method for detecting multiword phrases in mathematical texts, based on characteristic features of mathematical terminology and the dictionary-based software tool Lingo, and discusses its possible advantages for indexing and retrieval over stemming procedures.
Contribution
The paper presents a dictionary-based algorithmic approach for identifying multiword mathematical phrases, with potential benefits for automatic indexing and information retrieval.
Findings
Effective identification of multiword phrases in mathematical texts.
Advantages over stemming procedures for indexing.
Discussion of applications to information retrieval.
Abstract
We present an approach for detecting multiword phrases in mathematical text corpora. The method is based on characteristic features of mathematical terminology. It uses a software tool named Lingo, which identifies words by means of previously defined dictionaries for specific word classes such as adjectives, personal names, or nouns. The detection of multiword groups is done algorithmically. We discuss possible advantages of the method for indexing and information retrieval, as well as conclusions for applying dictionary-based methods of automatic indexing instead of stemming procedures.
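To make the idea of dictionary-based phrase detection concrete, here is a minimal sketch in Python. It does not use Lingo or reproduce the paper's algorithm. The toy dictionaries, the function names `word_class` and `detect_phrases`, and the pattern "one or more adjectives or personal names followed by one or more nouns" are all assumptions chosen for illustration.

```python
import re

# Hypothetical toy dictionaries, one per word class (not Lingo's real data).
ADJECTIVES = {"abelian", "compact", "finite", "topological"}
NAMES = {"banach", "hilbert", "riemann"}
NOUNS = {"group", "manifold", "space", "surface"}


def word_class(token):
    """Map a token to a one-letter class code by dictionary lookup.

    'A' = adjective, 'P' = personal name, 'N' = noun, 'x' = not in any dictionary.
    """
    t = token.lower()
    if t in ADJECTIVES:
        return "A"
    if t in NAMES:
        return "P"
    if t in NOUNS:
        return "N"
    return "x"


def detect_phrases(text):
    """Return candidate multiword phrases found in `text`.

    Assumed pattern: one or more adjectives/names followed by one or more nouns.
    Punctuation is ignored, so a phrase could wrongly span a comma; a fuller
    implementation would split on sentence and clause boundaries first.
    """
    tokens = re.findall(r"[A-Za-z]+", text)
    # One character per token, so match offsets are also token indices.
    classes = "".join(word_class(t) for t in tokens)
    return [
        " ".join(tokens[m.start():m.end()]).lower()
        for m in re.finditer(r"[AP]+N+", classes)
    ]


if __name__ == "__main__":
    sample = ("Every compact Riemann surface is a topological space, "
              "and a finite abelian group acts on it.")
    print(detect_phrases(sample))
```

Because each token is looked up as a whole word, an unknown word simply breaks the run rather than being reduced to a stem. This is one way to picture the contrast with stemming that the abstract mentions.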
Taxonomy
Topics: Mathematics, Computing, and Information Processing · Natural Language Processing Techniques
