IITK at the FinSim Task: Hypernym Detection in Financial Domain via Context-Free and Contextualized Word Embeddings

Vishal Keswani; Sakshi Singh; Ashutosh Modi

arXiv:2007.11201·cs.CL·July 23, 2020·1 cites

TL;DR

This paper presents a hybrid approach that uses both context-free and contextualized embeddings for hypernym detection in the financial domain; the system ranked 1st on both evaluation metrics in the FinSim 2020 shared task.

Contribution

It introduces a combined method leveraging Word2vec and BERT embeddings with both supervised and unsupervised classifiers for financial hypernym detection.

Findings

1. System ranked 1st in the FinSim 2020 task
2. Combining embeddings improves classification accuracy
3. Supervised classifiers outperform unsupervised methods in this context

Abstract

In this paper, we present our approaches for the FinSim 2020 shared task on "Learning Semantic Representations for the Financial Domain". The goal of this task is to classify financial terms into the most relevant hypernym (or top-level) concept in an external ontology. We leverage both context-dependent and context-independent word embeddings in our analysis. Our systems deploy Word2vec embeddings trained from scratch on the corpus (Financial Prospectus in English) along with pre-trained BERT embeddings. We divide the test dataset into two subsets based on a domain rule. For one subset, we use unsupervised distance measures to classify the term. For the second subset, we use simple supervised classifiers like Naive Bayes, on top of the embeddings, to arrive at a final prediction. Finally, we combine both the results. Our system ranks 1st based on both the metrics, i.e., mean rank and…
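The unsupervised step described in the abstract (classifying a term by a distance measure between its embedding and the embeddings of the candidate hypernyms) can be illustrated with a minimal sketch. This is not the authors' implementation: cosine similarity is assumed as the distance measure, the function names `cosine` and `rank_hypernyms` are hypothetical, and the 3-dimensional vectors and the labels are invented toy values standing in for real Word2vec or BERT embeddings.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def rank_hypernyms(term_vec, label_vecs):
    """Return hypernym labels sorted from most to least similar to the term."""
    scored = [(label, cosine(term_vec, vec)) for label, vec in label_vecs.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return [label for label, _ in scored]


# Toy embeddings (assumption: invented values, not taken from the paper).
label_vecs = {
    "Bonds": [0.9, 0.1, 0.0],
    "Swap": [0.1, 0.9, 0.1],
    "Funds": [0.0, 0.2, 0.9],
}
term_vec = [0.8, 0.3, 0.1]  # stand-in for the embedding of a financial term

ranking = rank_hypernyms(term_vec, label_vecs)
print(ranking)  # most similar label first
```

A ranked list, rather than a single label, fits a rank-based evaluation such as the mean rank metric the abstract mentions, since the position of the correct hypernym in the list can be scored directly.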

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Advanced Text Analysis Techniques

Methods: Linear Layer · Residual Connection · Layer Normalization · Adam · Multi-Head Attention · Attention Dropout · Dropout · WordPiece · Weight Decay · Linear Warmup With Linear Decay