TagRec++: Hierarchical Label Aware Attention Network for Question Categorization
Venktesh Viswanathan, Mukesh Mohania, Vikram Goyal

TL;DR
TagRec++ is a hierarchical label-aware attention network that reformulates question categorization as a dense retrieval task. It captures the semantic relatedness between questions and hierarchical labels, mitigates class imbalance, and outperforms existing methods in Recall@k.
Contribution
The paper proposes TagRec++, a novel hierarchical label-aware attention network that models labels as token compositions and employs cross-attention and adaptive negative sampling for improved question categorization.
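The summary above does not say how the adaptive negative sampling works. One common reading of the idea is hard negative mining: for each question, the incorrect labels that the current model scores highest are chosen as negatives, so the selection changes as the model changes. The sketch below illustrates only that reading. The function name hard_negatives, the similarity scores, and the label strings are all hypothetical, and this is not confirmed to be the authors' procedure.

```python
def hard_negatives(scores, gold, n=2):
    """Pick the n highest-scoring labels that are NOT the gold label.

    scores: dict mapping label -> similarity to the question under the
            current model (hypothetical values here).
    gold:   the correct label for the question.
    """
    # Drop the gold label, then rank the remaining labels by similarity.
    candidates = [(s, label) for label, s in scores.items() if label != gold]
    candidates.sort(key=lambda c: c[0], reverse=True)
    return [label for _, label in candidates[:n]]


# Hypothetical similarity scores for one question.
scores = {
    "science/physics/optics": 0.91,
    "science/physics/waves": 0.84,
    "science/biology/cells": 0.12,
    "mathematics/algebra/equations": 0.05,
}
negs = hard_negatives(scores, gold="science/physics/optics", n=2)
```

Under this reading, negatives that share parts of the hierarchy with the gold label (here, the same subject and chapter) tend to be selected first, which is what makes them informative for training.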
Findings
Outperforms state-of-the-art methods on question datasets, as measured by Recall@k.
Demonstrates zero-shot capabilities and adaptability to label changes.
Captures semantic relatedness between content and hierarchical labels.
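For readers unfamiliar with the metric, Recall@k is commonly defined as the fraction of a question's correct labels that appear among the top k retrieved labels, averaged over all questions. The sketch below follows that common definition. The function names and the toy inputs are illustrative, and the paper's exact evaluation protocol is not given in this summary.

```python
def recall_at_k(ranked, relevant, k):
    """Fraction of relevant labels found in the top-k of a ranked list."""
    relevant = set(relevant)
    if not relevant:
        return 0.0
    hits = sum(1 for label in ranked[:k] if label in relevant)
    return hits / len(relevant)


def mean_recall_at_k(all_ranked, all_relevant, k):
    """Average Recall@k over a collection of questions."""
    scores = [recall_at_k(r, g, k) for r, g in zip(all_ranked, all_relevant)]
    return sum(scores) / len(scores)


# Toy example: two questions, each with one correct label.
ranked_lists = [["optics", "waves"], ["cells", "genetics"]]
gold_labels = [["optics"], ["genetics"]]
r1 = mean_recall_at_k(ranked_lists, gold_labels, k=1)
r2 = mean_recall_at_k(ranked_lists, gold_labels, k=2)
```

With one correct label per question, Recall@k reduces to the share of questions whose correct label appears in the top k.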
Abstract
Online learning systems have multiple data repositories in the form of transcripts, books and questions. To enable ease of access, such systems organize the content according to a well-defined hierarchical taxonomy (subject-chapter-topic). The task of categorizing inputs into the hierarchical labels is usually cast as a flat multi-class classification problem. Such approaches ignore the semantic relatedness between the terms in the input and the tokens in the hierarchical labels. Alternative approaches also suffer from class imbalance when they only consider leaf-level nodes as labels. To tackle these issues, we formulate the task as a dense retrieval problem to retrieve the appropriate hierarchical labels for each content. In this paper, we deal with categorizing questions. We model the hierarchical labels as a composition of their tokens and use an efficient cross-attention…
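The retrieval formulation in the abstract can be illustrated with a small sketch: each hierarchical label (subject, chapter, topic) is treated as the composition of its tokens, the question and every label are mapped to vectors, and labels are ranked by similarity to the question. Everything below is an assumption made for illustration. A bag-of-words count vector stands in for the learned encoders and attention mechanism, and the names tokenize, embed, cosine, retrieve, and the LABELS list are hypothetical.

```python
import math
from collections import Counter


def tokenize(text):
    # Lowercase and split on any non-alphanumeric character.
    cleaned = "".join(c.lower() if c.isalnum() else " " for c in text)
    return cleaned.split()


def embed(text):
    # Toy stand-in for a learned encoder: a bag-of-words count vector.
    return Counter(tokenize(text))


def cosine(a, b):
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(question, labels, k=2):
    """Rank hierarchical labels by similarity to the question."""
    q = embed(question)
    # Each label path is flattened into one token sequence, so tokens
    # from every level of the hierarchy contribute to the match.
    scored = [(cosine(q, embed(" ".join(path))), path) for path in labels]
    scored.sort(key=lambda s: s[0], reverse=True)
    return [path for _, path in scored[:k]]


# Hypothetical subject-chapter-topic taxonomy.
LABELS = [
    ("science", "physics", "optics"),
    ("science", "biology", "cell structure"),
    ("mathematics", "algebra", "linear equations"),
]

top = retrieve("Which lens is used to correct myopia in optics?", LABELS, k=1)
```

Because labels are scored rather than predicted as fixed class indices, a label added to the taxonomy later can be ranked without changing the output layer, which is one way to understand the adaptability to label changes noted in the findings.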
Taxonomy
Topics: Topic Modeling · Text and Document Classification Technologies · Multimodal Machine Learning Applications
