Business Taxonomy Construction Using Concept-Level Hierarchical Clustering
Haodong Bai, Frank Z. Xing, Erik Cambria, Win-Bin Huang

TL;DR
This paper introduces an automatic method for constructing business taxonomies from corporate reports using hierarchical clustering, tailored for emerging markets and capable of discovering new terms with minimal supervision.
Contribution
It presents a novel concept-level hierarchical clustering approach for business taxonomy construction that adapts to emerging markets and identifies new industry terms.
Findings
Effective taxonomy construction for Chinese NEEQ market
Outperforms static taxonomies in reflecting new market features
Supports better understanding and investment in growth companies
Abstract
Business taxonomies are indispensable tools for investors conducting equity research and making professional decisions. However, identifying the structure of industry sectors in an emerging market is challenging for two reasons. First, existing taxonomies are designed for mature markets and may not appropriately classify small companies with innovative business models. Second, emerging markets develop quickly, so static business taxonomies cannot promptly reflect new features. In this article, we propose a new method to construct business taxonomies automatically from the content of corporate annual reports. Extracted concepts are hierarchically clustered using greedy affinity propagation. Our method requires less supervision and is able to discover new terms. Experiments and evaluation on the Chinese National Equities Exchange and Quotations (NEEQ) market show…
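To make the clustering step in the abstract more concrete, here is a minimal sketch of layer-by-layer clustering with affinity propagation. The abstract does not spell out the greedy variant, so this substitutes plain affinity propagation, re-applied to the exemplars of each layer; it should not be read as the authors' implementation. The concept names, their 2-D coordinates, the negative squared distance similarity, the median preference, and the helper names `affinity_propagation`, `neg_sq_dist`, and `build_hierarchy` are all assumptions made for illustration.

```python
from statistics import median

def affinity_propagation(S, damping=0.7, iters=200):
    """Plain affinity propagation; S[k][k] is the preference of point k.
    Returns, for each point, the index of its exemplar."""
    n = len(S)
    R = [[0.0] * n for _ in range(n)]  # responsibilities
    A = [[0.0] * n for _ in range(n)]  # availabilities
    for _ in range(iters):
        for i in range(n):
            for k in range(n):
                # Strongest competing candidate exemplar for point i.
                rival = max(A[i][j] + S[i][j] for j in range(n) if j != k)
                R[i][k] = damping * R[i][k] + (1 - damping) * (S[i][k] - rival)
        for k in range(n):
            pos = [max(0.0, R[j][k]) for j in range(n)]
            support = sum(pos) - pos[k]  # positive responsibilities from j != k
            for i in range(n):
                new = support if i == k else min(0.0, R[k][k] + support - pos[i])
                A[i][k] = damping * A[i][k] + (1 - damping) * new
    exemplars = [k for k in range(n) if R[k][k] + A[k][k] > 0]
    if not exemplars:  # no exemplar emerged: fall back to a single cluster
        exemplars = [0]
    return [i if i in exemplars else max(exemplars, key=lambda k: S[i][k])
            for i in range(n)]

def neg_sq_dist(u, v):
    """Assumed similarity: negative squared Euclidean distance."""
    return -sum((a - b) ** 2 for a, b in zip(u, v))

def build_hierarchy(vectors):
    """Cluster, keep the exemplars, then cluster those again.
    Returns one {child index: parent exemplar index} dict per layer, bottom-up."""
    levels, current = [], list(range(len(vectors)))
    while len(current) > 2:
        m = len(current)
        S = [[neg_sq_dist(vectors[a], vectors[b]) for b in current] for a in current]
        # Assumed preference: median of the off-diagonal similarities.
        pref = median(S[i][j] for i in range(m) for j in range(m) if i != j)
        for i in range(m):
            S[i][i] = pref
        labels = affinity_propagation(S)
        parents = {current[i]: current[labels[i]] for i in range(m)}
        exemplars = sorted(set(parents.values()))
        if len(exemplars) == m:  # nothing merged, so stop adding layers
            break
        levels.append(parents)
        current = exemplars
    return levels

# Toy data (made up): six concepts with hand-picked 2-D coordinates.
concepts = ["cloud storage", "cloud computing", "data center",
            "solar panel", "photovoltaics", "wind turbine"]
vectors = [(0, 0), (1, 0), (1, 1), (10, 10), (11, 10), (11, 11)]

levels = build_hierarchy(vectors)
for depth, layer in enumerate(levels, start=1):
    for child, parent in layer.items():
        print(f"layer {depth}: {concepts[child]} -> {concepts[parent]}")
```

In this toy setup the two well-separated groups should each collapse onto one exemplar concept, which then acts as the parent node in the layer above. A real pipeline would replace the hand-picked coordinates with concept representations extracted from the annual reports.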
Taxonomy
Topics: Web Data Mining and Analysis · Data Mining Algorithms and Applications · Advanced Text Analysis Techniques
