Corpus spécialisé et ressource de spécialité
Bernard Jacquemin (ISC, UMR 7044, GERIICO), Sabine Ploux (ISC)

TL;DR
This paper presents a method that applies a mathematical and statistical model, the Semantic Atlas, to relations between words in a domain-specific corpus in order to visualize word senses and automatically create a specialized dictionary, supporting semantic navigation and linguistic analysis.
Contribution
It introduces a novel approach combining the Semantic Atlas model with morpho-syntactic analysis to automatically generate domain-specific semantic resources.
Findings
Effective visualization of word senses based on corpus relations
Automatic creation of specialized dictionaries from syntactic relations
Potential applications in semantic navigation and diachronic language studies
Abstract
"Semantic Atlas" is a mathematical and statistical model for visualising word senses according to the relations between words. The model, which has been applied to proximity relations drawn from a corpus, has shown its ability to distinguish word senses as the corpus's contributors understand them. We propose using the model with a specialised corpus to automatically create a specialised dictionary for the corpus's domain. A morpho-syntactic analysis of the corpus makes it possible to build the dictionary from the syntactic relations between lexical units. The resulting semantic resource can be used to navigate the corpus semantically, and not only lexically, to create classical dictionaries, or to carry out diachronic studies of the language.
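The step of deriving dictionary entries from syntactic relations between lexical units can be illustrated with a minimal sketch. It assumes the morpho-syntactic analysis has already produced (head, relation, dependent) triples; the function name `build_relation_dictionary`, the relation labels, and the sample triples are hypothetical and are not taken from the paper. The sketch only counts relations per lexical unit and does not reproduce the Semantic Atlas model itself.

```python
from collections import Counter, defaultdict


def build_relation_dictionary(triples):
    """Map each lexical unit to its syntactic neighbours, most frequent first.

    `triples` is an iterable of (head, relation, dependent) tuples, assumed
    to come from a parser run over the specialised corpus.
    """
    counts = defaultdict(Counter)
    for head, relation, dependent in triples:
        # Record the relation from the head's point of view...
        counts[head][(relation, dependent)] += 1
        # ...and the inverse relation from the dependent's point of view.
        counts[dependent][(relation + "_of", head)] += 1
    # Sort neighbours by descending frequency, then alphabetically for stability.
    return {
        word: sorted(neighbours.items(), key=lambda kv: (-kv[1], kv[0]))
        for word, neighbours in counts.items()
    }


# Hypothetical triples, for illustration only.
triples = [
    ("treat", "obj", "infection"),
    ("treat", "obj", "infection"),
    ("treat", "subj", "doctor"),
    ("prescribe", "subj", "doctor"),
]
entries = build_relation_dictionary(triples)
for neighbour, count in entries["treat"]:
    print(neighbour, count)
```

Each entry lists the words a lexical unit is syntactically related to in the corpus; a model such as the one described in the abstract would take relations of this kind as input for distinguishing and visualising senses.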
Taxonomy
Topics: Natural Language Processing Techniques · Topic Modeling · Lexicography and Language Studies
