A Rule-Based Methodology for Company Identification: Application to the Downstream Space Sector
Kenza Bousedra, Pierre Pelletier

TL;DR
This paper introduces a rule-based Named Entity Recognition (NER) methodology for identifying downstream space sector companies in digitized French press articles. It detects 88 new companies, enlarging the existing sector database by 33%, and provides guidelines for adapting the method to other regions and industries.
Contribution
The paper presents a novel rule-based NER approach specifically designed for identifying companies in the downstream space industry from textual data.
Findings
Detected 88 new downstream space companies
Enriched the sector database by 33%
Provided adaptable guidelines for future applications
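As a rough sanity check on how the two reported figures relate, the arithmetic below back-computes the size of the pre-existing database. It assumes that "33%" means new companies divided by the prior database size and that the figure is rounded to the nearest percent; neither assumption is stated in this summary, and the variable names are illustrative.

```python
import math

new_found = 88          # reported number of newly detected companies
reported_growth = 0.33  # reported enrichment, assumed to be new / prior size

# Point estimate of the prior database size under that assumption.
implied_base = new_found / reported_growth

# Prior sizes that would still round to 33% (i.e. growth in [32.5%, 33.5%)).
smallest_base = math.ceil(new_found / 0.335)
largest_base = math.floor(new_found / 0.325)

print(round(implied_base), smallest_base, largest_base)
```

Under these assumptions the prior database would hold roughly 263 to 270 companies; the exact count is not given here.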
Abstract
This paper proposes an original methodology based on Named Entity Recognition (NER) to identify companies involved in downstream space activities, i.e., companies that provide services or products exploiting data and technology from space. Using a rule-based approach, the method leverages a corpus of digitized French press articles to extract the names of companies related to the downstream space segment. This approach allowed the detection of 88 new downstream space companies, enriching the existing database of the sector by 33%. The paper details the identification process and provides guidelines for replicating the method, applying it to other geographic areas, or adapting it to other industries where new entrants are difficult to identify using traditional activity classifications.
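To make the idea of rule-based company extraction concrete, here is a minimal sketch in Python using only regular expressions. It is not the authors' pipeline: the keyword list, trigger phrases, legal-form suffixes, function names, and the sample text (with invented company names) are all assumptions chosen for illustration.

```python
import re

# Hypothetical rule lists; the paper's actual rules are not given in this summary.
SPACE_KEYWORDS = ("satellite", "gnss", "galileo", "copernicus")
# Trigger phrases that often introduce a company name (case-insensitive).
TRIGGERS = r"(?i:la société|l'entreprise|la start-up|la startup)"
# A candidate name: one to three capitalized tokens.
NAME = r"[A-Z][\w-]*(?: [A-Z][\w-]*){0,2}"
RULES = [
    re.compile(rf"{TRIGGERS} ({NAME})"),          # rule 1: trigger + name
    re.compile(rf"({NAME}) (?:SAS|SARL|SA)\b"),   # rule 2: name + legal form
]

def split_sentences(text):
    """Naive sentence splitter on terminal punctuation."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]

def extract_candidates(text):
    """Return names matched by any rule in sentences mentioning a space keyword."""
    found = set()
    for sentence in split_sentences(text):
        lowered = sentence.lower()
        if not any(keyword in lowered for keyword in SPACE_KEYWORDS):
            continue  # sentence is not about the downstream space segment
        for rule in RULES:
            found.update(match.group(1) for match in rule.finditer(sentence))
    return found

def new_companies(text, known):
    """Keep only candidates absent from an existing database (case-insensitive)."""
    known_lower = {name.lower() for name in known}
    return sorted(c for c in extract_candidates(text) if c.lower() not in known_lower)

# Invented sample text; the company names are fictional.
SAMPLE = (
    "La société Fictigeo exploite des images satellite pour l'agriculture. "
    "L'entreprise Boulangerie Martin ouvre un magasin à Lyon. "
    "Exemplo Nav SAS propose un service GNSS aux transporteurs."
)

print(new_companies(SAMPLE, known={"Fictigeo"}))
```

The keyword filter is what restricts extraction to the sector of interest, and the final comparison against a known list is what isolates new entrants. A real application would need far richer rules and name normalization than this sketch.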
Taxonomy
Topics: Law, logistics, and international trade
