An Improved Method for Class-specific Keyword Extraction: A Case Study in the German Business Registry
Stephen Meisenbacher, Tim Schopf, Weixin Yan, Patrick Holl, and Florian Matthes

TL;DR
This paper introduces an enhanced class-specific keyword extraction method based on KeyBERT, tailored for German business registry data, significantly outperforming previous approaches in classifying economic sectors.
Contribution
The paper presents a novel improvement to KeyBERT for extracting class-specific keywords, specifically applied to German business registry entries for economic sector classification.
Findings
Significant performance improvement over previous methods
Effective identification of class-specific keywords
Sets a new standard for class-specific keyword extraction
Abstract
The task of keyword extraction is often an important initial step in unsupervised information extraction, forming the basis for tasks such as topic modeling or document classification. While recent methods have proven to be quite effective in the extraction of keywords, the identification of class-specific keywords, or only those pertaining to a predefined class, remains challenging. In this work, we propose an improved method for class-specific keyword extraction, which builds upon the popular KeyBERT library to identify only keywords related to a class described by seed keywords. We test this method using a dataset of German business registry entries, where the goal is to classify each business according to an economic sector. Our results reveal that our method greatly improves upon previous approaches, setting a new standard for…
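To make the idea of class-specific extraction concrete, here is a minimal toy sketch, not the authors' method and not KeyBERT itself. It assumes the class is described by a handful of seed keywords and ranks candidate words from a text by their similarity to those seeds. Character trigram overlap stands in for the sentence embeddings a real system would use, and the function names, the threshold, and the example text are all hypothetical.

```python
from collections import Counter
from math import sqrt


def char_ngrams(word, n=3):
    """Character n-gram counts for a word, padded with boundary markers."""
    padded = f"#{word.lower()}#"
    return Counter(padded[i:i + n] for i in range(len(padded) - n + 1))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def class_specific_keywords(text, seed_keywords, top_n=3, threshold=0.2):
    """Rank candidate words by their best similarity to any seed keyword.

    Trigram similarity is a stand-in (assumption) for embedding similarity;
    words below the threshold are treated as unrelated to the class.
    """
    seeds = [char_ngrams(s) for s in seed_keywords]
    candidates = {w.strip(".,;:()").lower() for w in text.split()}
    scored = []
    for cand in candidates:
        if len(cand) < 4:  # skip very short words such as articles
            continue
        score = max(cosine(char_ngrams(cand), s) for s in seeds)
        if score >= threshold:
            scored.append((cand, score))
    # Highest score first; ties broken alphabetically for determinism.
    scored.sort(key=lambda pair: (-pair[1], pair[0]))
    return scored[:top_n]


# Hypothetical registry-style entry and seed keywords for a "bakery" class.
entry = "Der Betrieb einer Bäckerei und Konditorei sowie der Verkauf von Backwaren"
print(class_specific_keywords(entry, ["Backwaren", "Bäcker"]))
```

Only words that resemble a seed keyword survive the threshold, which is the behavior that separates class-specific extraction from general keyword extraction. Surface similarity misses related words with different spelling, which is why embedding-based similarity is the usual choice in practice.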
Taxonomy
Topics: Advanced Text Analysis Techniques
Methods: Lib
