Enhancing Hyperspace Analogue to Language (HAL) Representations via Attention-Based Pooling for Text Classification
Ali Sakour, Zoalfekar Sakour

TL;DR
This paper improves HAL-based text classification by adding a learnable attention mechanism that emphasizes important words, yielding a 6.74-percentage-point accuracy gain on IMDB and better interpretability than traditional mean pooling.
Contribution
Introduces attention-based pooling over HAL word representations to produce better sentence embeddings for text classification.
Findings
Achieved 82.38% accuracy on the IMDB dataset.
Improved accuracy by 6.74 percentage points over mean pooling.
Enhanced interpretability by analyzing attention weights.
Abstract
The Hyperspace Analogue to Language (HAL) model relies on global word co-occurrence matrices to construct distributional semantic representations. While these representations capture lexical relationships effectively, aggregating them into sentence-level embeddings via standard mean pooling often results in information loss. Mean pooling assigns equal weight to all tokens, thereby diluting the impact of contextually salient words with uninformative structural tokens. In this paper, we address this limitation by integrating a learnable, temperature-scaled additive attention mechanism into the HAL representation pipeline. To mitigate the sparsity and high dimensionality of the raw co-occurrence matrices, we apply Truncated Singular Value Decomposition (SVD) to project the vectors into a dense latent space prior to the attention layer. We evaluate the proposed architecture on the IMDB…
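The pipeline the abstract describes (HAL co-occurrence vectors, Truncated SVD, then temperature-scaled additive attention pooling) can be sketched roughly as follows. This is a minimal NumPy illustration, not the authors' implementation: the function names, the toy corpus, the window size, the latent and attention dimensions, and the randomly initialized (untrained) parameters `W`, `b`, `v` are all assumptions made for the example. In the paper the attention parameters are learned.

```python
import numpy as np

def hal_matrix(docs, window=2):
    """Distance-weighted co-occurrence counts (nearer neighbours weigh more)."""
    vocab = {w: i for i, w in enumerate(sorted({w for d in docs for w in d}))}
    M = np.zeros((len(vocab), len(vocab)))
    for doc in docs:
        for i, w in enumerate(doc):
            for d in range(1, window + 1):
                if i - d >= 0:
                    # word at distance d before w gets weight window - d + 1
                    M[vocab[w], vocab[doc[i - d]]] += window - d + 1
    # Each word vector concatenates its preceding- and following-context counts.
    return vocab, np.hstack([M, M.T])

def truncated_svd(X, k):
    """Project sparse HAL rows onto the top-k singular directions."""
    U, S, _ = np.linalg.svd(X, full_matrices=False)
    return U[:, :k] * S[:k]

def attention_pool(H, W, b, v, tau=1.0):
    """Additive attention: score_i = v . tanh(W^T h_i + b) / tau, then softmax."""
    scores = np.tanh(H @ W + b) @ v / tau
    scores = scores - scores.max()          # numerical stability
    alpha = np.exp(scores) / np.exp(scores).sum()
    return alpha @ H, alpha                 # weighted sum of token vectors

# Toy corpus (hypothetical; the paper evaluates on IMDB).
docs = [["the", "movie", "was", "great"], ["the", "plot", "was", "dull"]]
vocab, X = hal_matrix(docs, window=2)
E = truncated_svd(X, k=3)                   # dense latent word vectors

rng = np.random.default_rng(0)
W, b, v = rng.normal(size=(3, 4)), np.zeros(4), rng.normal(size=4)

H = E[[vocab[w] for w in docs[0]]]          # token vectors for one sentence
sentence_vec, alpha = attention_pool(H, W, b, v, tau=0.5)
```

With `v` set to zero every token receives the same score, so the attention weights become uniform and the result reduces to mean pooling. Lowering `tau` sharpens the weight distribution toward the highest-scoring tokens, and `alpha` is the per-token quantity one would inspect for interpretability.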
Taxonomy
Topics: Sentiment Analysis and Opinion Mining · Topic Modeling · Text and Document Classification Technologies
